MBCATs as modifiers of the beta-catenin pathway and methods of use

ABSTRACT

Human MBCAT genes are identified as modulators of the beta-catenin pathway, and thus are therapeutic targets for disorders associated with defective beta-catenin function. Methods for identifying modulators of beta-catenin, comprising screening for agents that modulate the activity of MBCAT, are provided.

REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to U.S. provisional patent application No. 60/361,242 filed Mar. 1, 2002. The content of the prior application is hereby incorporated in its entirety.

BACKGROUND OF THE INVENTION

[0002] Beta-catenin is an adherens junction protein. Adherens junctions (AJs; also called the zonula adherens) are critical for the establishment and maintenance of epithelial layers, such as those lining organ surfaces. AJs mediate adhesion between cells, communicate a signal that neighboring cells are present, and anchor the actin cytoskeleton. In serving these roles, AJs regulate normal cell growth and behavior. At several stages of embryogenesis, wound healing, and tumor cell metastasis, cells form and leave epithelia. This process, which involves the disruption and reestablishment of epithelial cell-cell contacts, may be regulated by the disassembly and assembly of AJs. AJs may also function in the transmission of the ‘contact inhibition’ signal, which instructs cells to stop dividing once an epithelial sheet is complete.

[0003] The AJ is a multiprotein complex assembled around calcium-regulated cell adhesion molecules called cadherins (Peifer, M. (1993) Science 262: 1667-1668). Cadherins are transmembrane proteins: the extracellular domain mediates homotypic adhesion with cadherins on neighboring cells, and the intracellular domain interacts with cytoplasmic proteins that transmit the adhesion signal and anchor the AJ to the actin cytoskeleton. These cytoplasmic proteins include the alpha-, beta-, and gamma-catenins. The beta-catenin protein shares 70% amino acid identity with both plakoglobin, which is found in desmosomes (another type of intercellular junction), and the product of the Drosophila segment polarity gene ‘armadillo’. Armadillo is part of a multiprotein AJ complex in Drosophila that also includes some homologs of alpha-catenin and cadherin, and genetic studies indicate that it is required for cell adhesion and cytoskeletal integrity.

[0004] Beta-catenin, in addition to its role as a cell adhesion component, also functions as a transcriptional co-activator in the Wnt signaling pathway through its interactions with the family of Tcf and Lef transcription factors (for a review see Polakis, (1999) Current Opinion in Genetics & Development, 9:15-21 and Gat U., et al., (1998) Cell 95:605-614).

[0005] The APC gene, which is mutant in adenomatous polyposis of the colon, is a negative regulator of beta-catenin signaling (Korinek, V. et al., (1997) Science 275: 1784-1787; Morin, P. J., et al., (1997) Science 275: 1787-1790). The APC protein normally binds to beta-catenin and, in combination with other proteins (including glycogen synthase kinase-3 beta and axin), is required for the efficient degradation of beta-catenin. The regulation of beta-catenin is critical to the tumor suppressive effect of APC, and this regulation can be circumvented by mutations in either APC or beta-catenin.

[0006] While mammals contain only a single beta-catenin gene, C. elegans contains three (Korswagen H C, et al., (2000) Nature 406:527-32). Each worm beta-catenin appears to carry out unique functions (Korswagen H C, et al., (2000) Nature 406:527-32; Natarajan L et al. (2001) Genetics 159: 159-72). Because of this divergence of function in C. elegans, it is possible to study specifically the role of beta-catenin in cell adhesion, which is mediated by the C. elegans beta-catenin HMP-2.

[0007] MAST205 is a protein with strong similarity to microtubule associated testis specific serine/threonine protein kinase (mouse Mtssk), which may act in spermatid maturation and microtubule organization. SAST, KIAA0561, and KIAA0303 are similar to MAST205 (Nagase T et al (1998) DNA Res 5:277-286; Walden P D and Millette C F (1996) Biol Reprod 55:1039-1044; Walden P D and Cowan N J (1993) Mol Cell Biol 13:7625-7635).

[0008] The ability to manipulate the genomes of model organisms such as C. elegans provides a powerful means to analyze biochemical processes that, due to significant evolutionary conservation, have direct relevance to more complex vertebrate organisms. Due to a high level of gene and pathway conservation, the strong similarity of cellular processes, and the functional conservation of genes between these model organisms and mammals, identification of the involvement of novel genes in particular pathways and their functions in such model organisms can directly contribute to the understanding of the correlative pathways and methods of modulating them in mammals (see, for example, Dulubova I, et al, J Neurochem 2001 April;77(1):229-38; Cai T, et al., Diabetologia 2001 January;44(1):81-8; Pasquinelli A E, et al., Nature. Nov. 2, 2000;408(6808):37-8; Ivanov I P, et al., EMBO J Apr. 17, 2000;19(8):1907-17; Vajo Z et al., Mamm Genome 1999 October;10(10):1000-4). For example, a genetic screen can be carried out in an invertebrate model organism having underexpression (e.g. knockout) or overexpression of a gene (referred to as a “genetic entry point”) that yields a visible phenotype. Additional genes are mutated in a random or targeted manner. When a gene mutation changes the original phenotype caused by the mutation in the genetic entry point, the gene is identified as a “modifier” involved in the same or overlapping pathway as the genetic entry point. When the genetic entry point is an ortholog of a human gene implicated in a disease pathway, such as beta-catenin, modifier genes can be identified that may be attractive candidate targets for novel therapeutics.

[0009] All references cited herein, including patents, patent applications, publications, and sequence information in referenced Genbank identifier numbers, are incorporated herein in their entireties.

SUMMARY OF THE INVENTION

[0010] We have discovered genes that modify the beta-catenin pathway in C. elegans, and identified their human orthologs, hereinafter referred to as modifier of beta-catenin (MBCAT). The invention provides methods for utilizing these beta-catenin modifier genes and polypeptides to identify MBCAT-modulating agents that are candidate therapeutic agents that can be used in the treatment of disorders associated with defective or impaired beta-catenin function and/or MBCAT function. Preferred MBCAT-modulating agents specifically bind to MBCAT polypeptides and restore beta-catenin function. Other preferred MBCAT-modulating agents are nucleic acid modulators such as antisense oligomers and RNAi that repress MBCAT gene expression or product activity by, for example, binding to and inhibiting the respective nucleic acid (i.e. DNA or mRNA).

[0011] MBCAT modulating agents may be evaluated by any convenient in vitro or in vivo assay for molecular interaction with an MBCAT polypeptide or nucleic acid. In one embodiment, candidate MBCAT modulating agents are tested with an assay system comprising an MBCAT polypeptide or nucleic acid. Agents that produce a change in the activity of the assay system relative to controls are identified as candidate beta-catenin modulating agents. The assay system may be cell-based or cell-free. MBCAT-modulating agents include MBCAT related proteins (e.g. dominant negative mutants, and biotherapeutics); MBCAT-specific antibodies; MBCAT-specific antisense oligomers and other nucleic acid modulators; and chemical agents that specifically bind to or interact with MBCAT or compete with an MBCAT binding partner (e.g. by binding to an MBCAT binding partner). In one specific embodiment, a small molecule modulator is identified using a kinase assay. In specific embodiments, the screening assay system is selected from a binding assay, an apoptosis assay, a cell proliferation assay, an angiogenesis assay, and a hypoxic induction assay.

[0012] In another embodiment, candidate beta-catenin pathway modulating agents are further tested using a second assay system that detects changes in the beta-catenin pathway, such as angiogenic, apoptotic, or cell proliferation changes produced by the originally identified candidate agent or an agent derived from the original agent. The second assay system may use cultured cells or non-human animals. In specific embodiments, the secondary assay system uses non-human animals, including animals predetermined to have a disease or disorder implicating the beta-catenin pathway, such as an angiogenic, apoptotic, or cell proliferation disorder (e.g. cancer).

[0013] The invention further provides methods for modulating the MBCAT function and/or the beta-catenin pathway in a mammalian cell by contacting the mammalian cell with an agent that specifically binds an MBCAT polypeptide or nucleic acid. The agent may be a small molecule modulator, a nucleic acid modulator, or an antibody and may be administered to a mammalian animal predetermined to have a pathology associated with the beta-catenin pathway.

DETAILED DESCRIPTION OF THE INVENTION

[0014] Genetic screens were designed to identify modifiers of the beta-catenin pathway in C. elegans. A weak allele of beta-catenin was used in our screen (a homozygous viable mutant of beta-catenin, allele qm39). The hmp-2 (qm39) strain produces first stage larval worms (L1s) with a highly penetrant lumpy body phenotype. Various specific genes were silenced by RNA interference (RNAi). Methods for using RNAi to silence genes in C. elegans are known in the art (Fire A, et al., 1998 Nature 391:806-811; Fire, A. Trends Genet. 15, 358-363 (1999); WO9932619). The C10C6.1 gene was identified as a modifier of the beta-catenin pathway. Accordingly, vertebrate orthologs of the modifier, and preferably the human orthologs, MBCAT genes (i.e., nucleic acids and polypeptides), are attractive drug targets for the treatment of pathologies associated with a defective beta-catenin signaling pathway, such as cancer.

[0015] In vitro and in vivo methods of assessing MBCAT function are provided herein. Modulation of the MBCAT or their respective binding partners is useful for understanding the association of the beta-catenin pathway and its members in normal and disease conditions and for developing diagnostics and therapeutic modalities for beta-catenin related pathologies. MBCAT-modulating agents that act by inhibiting or enhancing MBCAT expression, directly or indirectly, for example, by affecting an MBCAT function such as enzymatic (e.g., catalytic) or binding activity, can be identified using methods provided herein. MBCAT modulating agents are useful in diagnosis, therapy and pharmaceutical development.

[0016] Nucleic Acids and Polypeptides of the Invention

[0017] Sequences related to MBCAT nucleic acids and polypeptides that can be used in the invention are disclosed in Genbank (referenced by Genbank identifier (GI) number) as GI#s 14149670 (SEQ ID NO:1), 3882334 (SEQ ID NO:2), 10440171 (SEQ ID NO:3), 16198348 (SEQ ID NO:4), 16549992 (SEQ ID NO:5), 21756034 (SEQ ID NO:6), 18600993 (SEQ ID NO:7), 22051641 (SEQ ID NO:9), 17455843 (SEQ ID NO:10), 18561839 (SEQ ID NO:11), 13177744 (SEQ ID NO:12), and 2224546 (SEQ ID NO:13) for nucleic acid, and GI#s 14149671 (SEQ ID NO:14), 14759411 (SEQ ID NO:15), 17455844 (SEQ ID NO:16), 18561840 (SEQ ID NO:17), and 27498257 (SEQ ID NO:18) for polypeptides. Additionally, the nucleic acid sequence of SEQ ID NO:8 may also be used in the invention.

[0018] MBCATs are kinase proteins with protein kinase domains. The term “MBCAT polypeptide” refers to a full-length MBCAT protein or a functionally active fragment or derivative thereof. A “functionally active” MBCAT fragment or derivative exhibits one or more functional activities associated with a full-length, wild-type MBCAT protein, such as antigenic or immunogenic activity, enzymatic activity, ability to bind natural cellular substrates, etc. The functional activity of MBCAT proteins, derivatives and fragments can be assayed by various methods known to one skilled in the art (Current Protocols in Protein Science (1998) Coligan et al., eds., John Wiley & Sons, Inc., Somerset, N.J.) and as further discussed below. In one embodiment, a functionally active MBCAT polypeptide is an MBCAT derivative capable of rescuing defective endogenous MBCAT activity, such as in cell based or animal assays; the rescuing derivative may be from the same or a different species. For purposes herein, functionally active fragments also include those fragments that comprise one or more structural domains of an MBCAT, such as a kinase domain or a binding domain. Protein domains can be identified using the PFAM program (Bateman A., et al., Nucleic Acids Res, 1999, 27:260-2). For example, the kinase domains of MBCAT from GI#s 14149671, 14759411, 17455844, and 27498257 (SEQ ID NOs: 14, 15, 16, and 18, respectively) are located respectively at approximately amino acid residues 512 to 785, 460 to 547, 197 to 470, and 39 to 312 (PFAM 00069). Methods for obtaining MBCAT polypeptides are also further described below. In some embodiments, preferred fragments are functionally active, domain-containing fragments comprising at least 25 contiguous amino acids, preferably at least 50, more preferably 75, and most preferably at least 100 contiguous amino acids of any one of SEQ ID NOs: 14-18 (an MBCAT). In further preferred embodiments, the fragment comprises the entire kinase (functionally active) domain.
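By way of illustration only, the approximate kinase domain coordinates given above can be turned into domain lengths and used to excise a domain-containing fragment from a polypeptide sequence. The following is a minimal sketch: the function names and the dictionary layout are hypothetical, the coordinates are treated as 1-based and inclusive (an assumption about how the residue ranges are counted), and no actual MBCAT sequence is reproduced here.

```python
# Approximate kinase domain coordinates quoted in the text
# (assumed 1-based, inclusive residue numbering).
KINASE_DOMAINS = {
    "SEQ ID NO:14": (512, 785),
    "SEQ ID NO:15": (460, 547),
    "SEQ ID NO:16": (197, 470),
    "SEQ ID NO:18": (39, 312),
}

# Fragment length tiers named in the text, longest first.
LENGTH_TIERS = [(100, "most preferred"), (75, "more preferred"),
                (50, "preferred"), (25, "minimum")]


def domain_length(start, end):
    """Number of residues in a 1-based, inclusive range."""
    return end - start + 1


def extract_domain(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a sequence."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("domain range lies outside the sequence")
    return sequence[start - 1:end]


def fragment_tier(length):
    """Name the longest length tier a fragment satisfies, or None."""
    for threshold, label in LENGTH_TIERS:
        if length >= threshold:
            return label
    return None


if __name__ == "__main__":
    for seq_id, (start, end) in KINASE_DOMAINS.items():
        n = domain_length(start, end)
        print(seq_id, n, fragment_tier(n))
```

Under these assumptions, a fragment spanning an entire quoted domain range would satisfy at least the 75-residue tier in every case listed.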

[0019] The term “MBCAT nucleic acid” refers to a DNA or RNA molecule that encodes an MBCAT polypeptide. Preferably, the MBCAT polypeptide or nucleic acid or fragment thereof is from a human, but can also be an ortholog, or derivative thereof with at least 70% sequence identity, preferably at least 80%, more preferably 85%, still more preferably 90%, and most preferably at least 95% sequence identity with human MBCAT. Methods of identifying orthologs are known in the art. Normally, orthologs in different species retain the same function, due to presence of one or more protein motifs and/or 3-dimensional structures. Orthologs are generally identified by sequence homology analysis, such as BLAST analysis, usually using protein bait sequences. Sequences are assigned as a potential ortholog if the best hit sequence from the forward BLAST result retrieves the original query sequence in the reverse BLAST (Huynen M A and Bork P, Proc Natl Acad Sci (1998) 95:5849-5856; Huynen M A et al., Genome Research (2000) 10:1204-1210). Programs for multiple sequence alignment, such as CLUSTAL (Thompson J D et al, 1994, Nucleic Acids Res 22:4673-4680) may be used to highlight conserved regions and/or residues of orthologous proteins and to generate phylogenetic trees. In a phylogenetic tree representing multiple homologous sequences from diverse species (e.g., retrieved through BLAST analysis), orthologous sequences from two species generally appear closest on the tree with respect to all other sequences from these two species. Structural threading or other analysis of protein folding (e.g., using software by ProCeryon, Biosciences, Salzburg, Austria) may also identify potential orthologs. In evolution, when a gene duplication event follows speciation, a single gene in one species, such as C. elegans, may correspond to multiple genes (paralogs) in another, such as human. As used herein, the term “orthologs” encompasses paralogs.
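The reciprocal best hit criterion described above (the best forward hit must retrieve the original query as its own best hit in the reverse search) can be sketched as follows. This is an illustrative example only: it does not run BLAST, the sequence identifiers and scores are invented toy values, and the function names are hypothetical.

```python
def best_hit(hits):
    """Return the ID with the highest score in {id: score}, or None."""
    if not hits:
        return None
    return max(hits, key=hits.get)


def reciprocal_best_hits(forward, reverse):
    """Pair each query with its best forward hit when that hit's own
    best hit in the reverse search is the original query.

    forward: {query_id: {subject_id: score}}
    reverse: {subject_id: {query_id: score}}
    """
    pairs = []
    for query, hits in forward.items():
        subject = best_hit(hits)
        if subject is None:
            continue
        if best_hit(reverse.get(subject, {})) == query:
            pairs.append((query, subject))
    return pairs


if __name__ == "__main__":
    # Toy scores, invented for illustration (not real BLAST output).
    forward = {
        "worm_gene_1": {"human_A": 950.0, "human_B": 400.0},
        "worm_gene_2": {"human_B": 300.0},
    }
    reverse = {
        "human_A": {"worm_gene_1": 940.0, "worm_gene_2": 100.0},
        "human_B": {"worm_gene_1": 410.0, "worm_gene_2": 290.0},
    }
    print(reciprocal_best_hits(forward, reverse))
```

In the toy data, worm_gene_2 is not paired because its best forward hit, human_B, retrieves worm_gene_1 rather than worm_gene_2 in the reverse direction.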
As used herein, “percent (%) sequence identity” with respect to a subject sequence, or a specified portion of a subject sequence, is defined as the percentage of nucleotides or amino acids in the candidate derivative sequence identical with the nucleotides or amino acids in the subject sequence (or specified portion thereof), after aligning the sequences and introducing gaps, if necessary to achieve the maximum percent sequence identity, as generated by the program WU-BLAST-2.0a19 (Altschul et al., J. Mol. Biol. (1990) 215:403-410) with all the search parameters set to default values. The HSP S and HSP S2 parameters are dynamic values and are established by the program itself depending upon the composition of the particular sequence and composition of the particular database against which the sequence of interest is being searched. A % identity value is determined by the number of matching identical nucleotides or amino acids divided by the sequence length for which the percent identity is being reported. “Percent (%) amino acid sequence similarity” is determined by doing the same calculation as for determining % amino acid sequence identity, but including conservative amino acid substitutions in addition to identical amino acids in the computation.

[0020] A conservative amino acid substitution is one in which an amino acid is substituted for another amino acid having similar properties such that the folding or activity of the protein is not significantly affected. Aromatic amino acids that can be substituted for each other are phenylalanine, tryptophan, and tyrosine; interchangeable hydrophobic amino acids are leucine, isoleucine, methionine, and valine; interchangeable polar amino acids are glutamine and asparagine; interchangeable basic amino acids are arginine, lysine and histidine; interchangeable acidic amino acids are aspartic acid and glutamic acid; and interchangeable small amino acids are alanine, serine, threonine, cysteine and glycine.
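For illustration, percent identity and percent similarity can be computed from a pair of already aligned sequences using the conservative substitution groups listed above. This sketch is not the WU-BLAST procedure itself: it assumes the alignment is supplied (with "-" marking gaps), and it assumes the reporting length is the ungapped length of the subject sequence, which is one possible reading of "the sequence length for which the percent identity is being reported". The function names and example peptides are hypothetical.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("FWY"),    # aromatic
    set("LIMV"),   # hydrophobic
    set("QN"),     # polar
    set("RKH"),    # basic
    set("DE"),     # acidic
    set("ASTCG"),  # small
]


def is_conservative(a, b):
    """True if a and b differ but fall in the same substitution group."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def percent_identity_similarity(aligned_candidate, aligned_subject):
    """Return (% identity, % similarity) for two aligned sequences.

    Assumption: the denominator is the ungapped subject length.
    """
    if len(aligned_candidate) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = subject_len = 0
    for c, s in zip(aligned_candidate, aligned_subject):
        if s != "-":
            subject_len += 1
        if c == "-" or s == "-":
            continue  # gap columns count toward neither score
        if c == s:
            identical += 1
        elif is_conservative(c, s):
            similar += 1
    if subject_len == 0:
        raise ValueError("subject sequence is empty")
    identity = 100.0 * identical / subject_len
    similarity = 100.0 * (identical + similar) / subject_len
    return identity, similarity


if __name__ == "__main__":
    # Invented ten-residue peptides: 7 identical, 3 conservative changes.
    print(percent_identity_similarity("MKLVADEFGH", "MRLIADEYGH"))
```

With the invented peptides shown, seven of ten positions are identical and the remaining three are conservative substitutions (K/R, V/I, F/Y), so the pair sits exactly at the 70% identity threshold while scoring 100% similarity.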

[0021] Alternatively, an alignment for nucleic acid sequences is provided by the local homology algorithm of Smith and Waterman (Smith and Waterman, 1981, Advances in Applied Mathematics 2:482-489; database: European Bioinformatics Institute; Smith and Waterman, 1981, J. of Molec. Biol., 147:195-197; Nicholas et al., 1998, “A Tutorial on Searching Sequence Databases and Sequence Scoring Methods” (www.psc.edu) and references cited therein; W. R. Pearson, 1991, Genomics 11:635-650). This algorithm can be applied to amino acid sequences by using the scoring matrix developed by Dayhoff (Dayhoff: Atlas of Protein Sequences and Structure, M. O. Dayhoff ed., 5 suppl. 3:353-358, National Biomedical Research Foundation, Washington, D.C., USA), and normalized by Gribskov (Gribskov 1986 Nucl. Acids Res. 14(6):6745-6763). The Smith-Waterman algorithm may be employed where default parameters are used for scoring (for example, gap open penalty of 12, gap extension penalty of two). From the data generated, the “Match” value reflects “sequence identity.”
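As a rough illustration of local alignment scoring with the example penalties above (gap open 12, gap extension 2), the following sketch computes a Smith-Waterman score with affine gaps. Several points are assumptions rather than anything stated in the text: a simple match/mismatch score stands in for the Dayhoff matrix, the first gapped position is charged the open penalty and each further position the extension penalty, and only the best score is returned (no traceback, so no "Match" value is derived).

```python
def smith_waterman_score(a, b, match=5, mismatch=-4,
                         gap_open=12, gap_extend=2):
    """Best local alignment score between a and b with affine gaps.

    Assumed gap convention: a gap of length k costs
    gap_open + (k - 1) * gap_extend.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of a local alignment ending at (i, j)
    # E: best score ending with a gap that consumes b[j-1] only
    # F: best score ending with a gap that consumes a[i-1] only
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_score("GGGGGTTCCCCC", "GGGGGCCCCC"))
```

A production comparison would substitute a residue scoring matrix for the match/mismatch values and add a traceback to recover the aligned positions.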

[0022] Derivative nucleic acid molecules of the subject nucleic acid molecules include sequences that hybridize to the nucleic acid sequence of any of SEQ ID NOs: 1-13. The stringency of hybridization can be controlled by temperature, ionic strength, pH, and the presence of denaturing agents such as formamide during hybridization and washing. Conditions routinely used are set out in readily available procedure texts (e.g., Current Protocols in Molecular Biology, Vol. 1, Chap. 2.10, John Wiley & Sons, Publishers (1994); Sambrook et al., Molecular Cloning, Cold Spring Harbor (1989)). In some embodiments, a nucleic acid molecule of the invention is capable of hybridizing to a nucleic acid molecule containing the nucleotide sequence of any one of SEQ ID NOs: 1-13 under high stringency hybridization conditions that are: prehybridization of filters containing nucleic acid for 8 hours to overnight at 65° C. in a solution comprising 6× single strength citrate (SSC) (1×SSC is 0.15 M NaCl, 0.015 M Na citrate; pH 7.0), 5× Denhardt's solution, 0.05% sodium pyrophosphate and 100 μg/ml herring sperm DNA; hybridization for 18-20 hours at 65° C. in a solution containing 6×SSC, 1× Denhardt's solution, 100 μg/ml yeast tRNA and 0.05% sodium pyrophosphate; and washing of filters at 65° C. for 1 h in a solution containing 0.1×SSC and 0.1% SDS (sodium dodecyl sulfate).

[0023] In other embodiments, moderately stringent hybridization conditions are used that are: pretreatment of filters containing nucleic acid for 6 h at 40° C. in a solution containing 35% formamide, 5×SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 μg/ml denatured salmon sperm DNA; hybridization for 18-20 h at 40° C. in a solution containing 35% formamide, 5×SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 μg/ml salmon sperm DNA, and 10% (wt/vol) dextran sulfate; followed by washing twice for 1 hour at 55° C. in a solution containing 2×SSC and 0.1% SDS.

[0024] Alternatively, low stringency conditions can be used that are: incubation for 8 hours to overnight at 37° C. in a solution comprising 20% formamide, 5×SSC, 50 mM sodium phosphate (pH 7.6), 5× Denhardt's solution, 10% dextran sulfate, and 20 μg/ml denatured sheared salmon sperm DNA; hybridization in the same buffer for 18 to 20 hours; and washing of filters in 1×SSC at about 37° C. for 1 hour.
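As a worked arithmetic aid, the SSC strengths used in the final washes of the three sets of conditions above can be converted to molar salt concentrations from the stated definition of 1×SSC (0.15 M NaCl, 0.015 M Na citrate). The sketch below only restates values given in the text; the dictionary layout and function names are hypothetical.

```python
# Composition of 1x SSC as defined in the text (mol/L).
SSC_1X_MOLAR = {"NaCl": 0.15, "Na citrate": 0.015}

# Final wash conditions quoted in the text for each stringency level.
FINAL_WASHES = {
    "high": {"ssc_fold": 0.1, "temp_c": 65},
    "moderate": {"ssc_fold": 2.0, "temp_c": 55},
    "low": {"ssc_fold": 1.0, "temp_c": 37},
}


def ssc_molarity(fold):
    """Molar concentrations of each solute in fold-strength SSC."""
    if fold <= 0:
        raise ValueError("SSC strength must be positive")
    return {solute: fold * molar for solute, molar in SSC_1X_MOLAR.items()}


def wash_summary():
    """Rows of (stringency, wash temperature in C, M NaCl, M Na citrate)."""
    rows = []
    for name, wash in FINAL_WASHES.items():
        molar = ssc_molarity(wash["ssc_fold"])
        rows.append((name, wash["temp_c"],
                     round(molar["NaCl"], 4),
                     round(molar["Na citrate"], 5)))
    return rows


if __name__ == "__main__":
    for row in wash_summary():
        print(row)
```

For example, the 6×SSC used during high stringency hybridization corresponds to about 0.9 M NaCl and 0.09 M Na citrate, whereas the 0.1×SSC high stringency wash corresponds to about 0.015 M NaCl.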

[0025] Isolation, Production, Expression, and Mis-Expression of MBCAT Nucleic Acids and Polypeptides

[0026] MBCAT nucleic acids and polypeptides are useful for identifying and testing agents that modulate MBCAT function and for other applications related to the involvement of MBCAT in the beta-catenin pathway. MBCAT nucleic acids and derivatives and orthologs thereof may be obtained using any available method. For instance, techniques for isolating cDNA or genomic DNA sequences of interest by screening DNA libraries or by using polymerase chain reaction (PCR) are well known in the art. In general, the particular use for the protein will dictate the particulars of expression, production, and purification methods. For instance, production of proteins for use in screening for modulating agents may require methods that preserve specific biological activities of these proteins, whereas production of proteins for antibody generation may require structural integrity of particular epitopes. Expression of proteins to be purified for screening or antibody production may require the addition of specific tags (e.g., generation of fusion proteins). Overexpression of an MBCAT protein for assays used to assess MBCAT function, such as involvement in cell cycle regulation or hypoxic response, may require expression in eukaryotic cell lines capable of these cellular activities. Techniques for the expression, production, and purification of proteins are well known in the art; any suitable means therefor may be used (e.g., Higgins S J and Hames B D (eds.) Protein Expression: A Practical Approach, Oxford University Press Inc., New York 1999; Stanbury P F et al., Principles of Fermentation Technology, 2nd edition, Elsevier Science, New York, 1995; Doonan S (ed.) Protein Purification Protocols, Humana Press, N.J., 1996; Coligan J E et al, Current Protocols in Protein Science (eds.), 1999, John Wiley & Sons, New York).
In particular embodiments, recombinant MBCAT is expressed in a cell line known to have defective beta-catenin function. The recombinant cells are used in cell-based screening assay systems of the invention, as described further below.

[0027] The nucleotide sequence encoding an MBCAT polypeptide can be inserted into any appropriate expression vector. The necessary transcriptional and translational signals, including promoter/enhancer element, can derive from the native MBCAT gene and/or its flanking regions or can be heterologous. A variety of host-vector expression systems may be utilized, such as mammalian cell systems infected with virus (e.g. vaccinia virus, adenovirus, etc.); insect cell systems infected with virus (e.g. baculovirus); microorganisms such as yeast containing yeast vectors, or bacteria transformed with bacteriophage, plasmid, or cosmid DNA. An isolated host cell strain that modulates the expression of, modifies, and/or specifically processes the gene product may be used.

[0028] To detect expression of the MBCAT gene product, the expression vector can comprise a promoter operably linked to an MBCAT gene nucleic acid, one or more origins of replication, and one or more selectable markers (e.g. thymidine kinase activity, resistance to antibiotics, etc.). Alternatively, recombinant expression vectors can be identified by assaying for the expression of the MBCAT gene product based on the physical or functional properties of the MBCAT protein in in vitro assay systems (e.g. immunoassays).

[0029] The MBCAT protein, fragment, or derivative may be optionally expressed as a fusion, or chimeric protein product (i.e. it is joined via a peptide bond to a heterologous protein sequence of a different protein), for example to facilitate purification or detection. A chimeric product can be made by ligating the appropriate nucleic acid sequences encoding the desired amino acid sequences to each other using standard methods and expressing the chimeric product. A chimeric product may also be made by protein synthetic techniques, e.g. by use of a peptide synthesizer (Hunkapiller et al., Nature (1984) 310:105-111).

[0030] Once a recombinant cell that expresses the MBCAT gene sequence is identified, the gene product can be isolated and purified using standard methods (e.g. ion exchange, affinity, and gel exclusion chromatography; centrifugation; differential solubility; electrophoresis). Alternatively, native MBCAT proteins can be purified from natural sources, by standard methods (e.g. immunoaffinity purification). Once a protein is obtained, it may be quantified and its activity measured by appropriate methods, such as immunoassay, bioassay, or other measurements of physical properties, such as crystallography.

[0031] The methods of this invention may also use cells that have been engineered for altered expression (mis-expression) of MBCAT or other genes associated with the beta-catenin pathway. As used herein, mis-expression encompasses ectopic expression, over-expression, under-expression, and non-expression (e.g. by gene knock-out or blocking expression that would otherwise normally occur).

[0032] Genetically Modified Animals

[0033] Animal models that have been genetically modified to alter MBCAT expression may be used in in vivo assays to test for activity of a candidate beta-catenin modulating agent, or to further assess the role of MBCAT in a beta-catenin pathway process such as apoptosis or cell proliferation. Preferably, the altered MBCAT expression results in a detectable phenotype, such as decreased or increased levels of cell proliferation, angiogenesis, or apoptosis compared to control animals having normal MBCAT expression. The genetically modified animal may additionally have altered beta-catenin expression (e.g. beta-catenin knockout). Preferred genetically modified animals are mammals such as primates, rodents (preferably mice or rats), among others. Preferred non-mammalian species include zebrafish, C. elegans, and Drosophila. Preferred genetically modified animals are transgenic animals having a heterologous nucleic acid sequence present as an extrachromosomal element in a portion of its cells, i.e. mosaic animals (see, for example, techniques described by Jakobovits, 1994, Curr. Biol. 4:761-763) or stably integrated into its germ line DNA (i.e., in the genomic sequence of most or all of its cells). Heterologous nucleic acid is introduced into the germ line of such transgenic animals by genetic manipulation of, for example, embryos or embryonic stem cells of the host animal.

[0034] Methods of making transgenic animals are well-known in the art (for transgenic mice see Brinster et al., Proc. Nat. Acad. Sci. USA 82: 4438-4442 (1985), U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al., and Hogan, B., Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (1986); for particle bombardment see U.S. Pat. No. 4,945,050, by Sanford et al.; for transgenic Drosophila see Rubin and Spradling, Science (1982) 218:348-53 and U.S. Pat. No. 4,670,388; for transgenic insects see Berghammer A. J. et al., A Universal Marker for Transgenic Insects (1999) Nature 402:370-371; for transgenic Zebrafish see Lin S., Transgenic Zebrafish, Methods Mol Biol. (2000);136:375-383; for microinjection procedures for fish, amphibian eggs and birds see Houdebine and Chourrout, Experientia (1991) 47:897-905; for transgenic rats see Hammer et al., Cell (1990) 63:1099-1112; and for culturing of embryonic stem (ES) cells and the subsequent production of transgenic animals by the introduction of DNA into ES cells using methods such as electroporation, calcium phosphate/DNA precipitation and direct injection see, e.g., Teratocarcinomas and Embryonic Stem Cells, A Practical Approach, E. J. Robertson, ed., IRL Press (1987)). Clones of the nonhuman transgenic animals can be produced according to available methods (see Wilmut, I. et al. (1997) Nature 385:810-813; and PCT International Publication Nos. WO 97/07668 and WO 97/07669).

[0035] In one embodiment, the transgenic animal is a “knock-out” animal having a heterozygous or homozygous alteration in the sequence of an endogenous MBCAT gene that results in a decrease of MBCAT function, preferably such that MBCAT expression is undetectable or insignificant. Knock-out animals are typically generated by homologous recombination with a vector comprising a transgene having at least a portion of the gene to be knocked out. Typically a deletion, addition or substitution has been introduced into the transgene to functionally disrupt it. The transgene can be a human gene (e.g., from a human genomic clone) but more preferably is an ortholog of the human gene derived from the transgenic host species. For example, a mouse MBCAT gene is used to construct a homologous recombination vector suitable for altering an endogenous MBCAT gene in the mouse genome. Detailed methodologies for homologous recombination in mice are available (see Capecchi, Science (1989) 244:1288-1292; Joyner et al., Nature (1989) 338:153-156). Procedures for the production of non-rodent transgenic mammals and other animals are also available (Houdebine and Chourrout, supra; Pursel et al., Science (1989) 244:1281-1288; Simms et al., Bio/Technology (1988) 6:179-183). In a preferred embodiment, knock-out animals, such as mice harboring a knockout of a specific gene, may be used to produce antibodies against the human counterpart of the gene that has been knocked out (Claesson M H et al., (1994) Scand J Immunol 40:257-264; Declerck P J et al., (1995) J Biol Chem. 270:8397-400).

[0036] In another embodiment, the transgenic animal is a “knock-in” animal having an alteration in its genome that results in altered expression (e.g., increased (including ectopic) or decreased expression) of the MBCAT gene, e.g., by introduction of additional copies of MBCAT, or by operatively inserting a regulatory sequence that provides for altered expression of an endogenous copy of the MBCAT gene. Such regulatory sequences include inducible, tissue-specific, and constitutive promoters and enhancer elements. The knock-in can be homozygous or heterozygous.

[0037] Transgenic nonhuman animals can also be produced that contain selected systems allowing for regulated expression of the transgene. One example of such a system that may be produced is the cre/loxP recombinase system of bacteriophage P1 (Lakso et al., PNAS (1992) 89:6232-6236; U.S. Pat. No. 4,959,317). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of “double” transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase. Another example of a recombinase system is the FLP recombinase system of Saccharomyces cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355; U.S. Pat. No. 5,654,182). In a preferred embodiment, both Cre-LoxP and Flp-Frt are used in the same system to regulate expression of the transgene, and for sequential deletion of vector sequences in the same cell (Sun X et al (2000) Nat Genet 25:83-6).

[0038] The genetically modified animals can be used in genetic studies to further elucidate the beta-catenin pathway, as animal models of diseases and disorders implicating defective beta-catenin function, and for in vivo testing of candidate therapeutic agents, such as those identified in the screens described below. The candidate therapeutic agents are administered to a genetically modified animal having altered MBCAT function, and phenotypic changes are compared with appropriate control animals, such as genetically modified animals that receive placebo treatment and/or animals with unaltered MBCAT expression that receive the candidate therapeutic agent.

[0039] In addition to the above-described genetically modified animals having altered MBCAT function, animal models having defective beta-catenin function (and otherwise normal MBCAT function) can be used in the methods of the present invention. Preferably, the candidate beta-catenin modulating agent, when administered to a model system with cells defective in beta-catenin function, produces a detectable phenotypic change in the model system indicating that the beta-catenin function is restored, i.e., the cells exhibit normal cell cycle progression.

[0040] Modulating Agents

[0041] The invention provides methods to identify agents that interact with and/or modulate the function of MBCAT and/or the beta-catenin pathway. Modulating agents identified by the methods are also part of the invention. Such agents are useful in a variety of diagnostic and therapeutic applications associated with the beta-catenin pathway, as well as in further analysis of the MBCAT protein and its contribution to the beta-catenin pathway. Accordingly, the invention also provides methods for modulating the beta-catenin pathway comprising the step of specifically modulating MBCAT activity by administering an MBCAT-interacting or -modulating agent.

[0042] As used herein, an “MBCAT-modulating agent” is any agent that modulates MBCAT function, for example, an agent that interacts with MBCAT to inhibit or enhance MBCAT activity or otherwise affect normal MBCAT function. MBCAT function can be affected at any level, including transcription, protein expression, protein localization, and cellular or extra-cellular activity. In a preferred embodiment, the MBCAT-modulating agent specifically modulates the function of the MBCAT. The phrases “specific modulating agent”, “specifically modulates”, etc., are used herein to refer to modulating agents that directly bind to the MBCAT polypeptide or nucleic acid, and preferably inhibit, enhance, or otherwise alter, the function of the MBCAT. These phrases also encompass modulating agents that alter the interaction of the MBCAT with a binding partner, substrate, or cofactor (e.g. by binding to a binding partner of an MBCAT, or to a protein/binding partner complex, and altering MBCAT function). In a further preferred embodiment, the MBCAT-modulating agent is a modulator of the beta-catenin pathway (e.g. it restores and/or upregulates beta-catenin function) and thus is also a beta-catenin-modulating agent.

[0043] Preferred MBCAT-modulating agents include small molecule compounds; MBCAT-interacting proteins, including antibodies and other biotherapeutics; and nucleic acid modulators such as antisense and RNA inhibitors. The modulating agents may be formulated in pharmaceutical compositions, for example, as compositions that may comprise other active ingredients, as in combination therapy, and/or suitable carriers or excipients. Techniques for formulation and administration of the compounds may be found in “Remington's Pharmaceutical Sciences”, Mack Publishing Co., Easton, Pa., 19th edition.

[0044] Small Molecule Modulators

[0045] Small molecules are often preferred to modulate function of proteins with enzymatic function, and/or containing protein interaction domains. Chemical agents, referred to in the art as “small molecule” compounds, are typically organic, non-peptide molecules having a molecular weight less than 10,000, preferably less than 5,000, more preferably less than 1,000, and most preferably less than 500. This class of modulators includes chemically synthesized molecules, for instance, compounds from combinatorial chemical libraries. Synthetic compounds may be rationally designed or identified based on known or inferred properties of the MBCAT protein or may be identified by screening compound libraries. Alternative appropriate modulators of this class are natural products, particularly secondary metabolites from organisms such as plants or fungi, which can also be identified by screening compound libraries for MBCAT-modulating activity. Methods for generating and obtaining compounds are well known in the art (Schreiber S L, Science (2000) 287:1964-1969; Radmann J and Gunther J, Science (2000) 287:1947-1948).
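
By way of non-limiting illustration, the molecular weight tiers recited above may be applied as a simple filter when triaging a compound library. The following is a minimal sketch only; the function name `size_preference`, the tier labels, and the assumption that molecular weights are expressed in daltons are illustrative and are not part of the disclosure.

```python
def size_preference(molecular_weight: float) -> str:
    """Map a molecular weight (assumed to be in daltons) to a preference tier.

    The cut-offs (10,000 / 5,000 / 1,000 / 500) are those recited in the text;
    the tier labels returned here are hypothetical names chosen for illustration.
    """
    if molecular_weight <= 0:
        raise ValueError("molecular weight must be positive")
    if molecular_weight < 500:
        return "most preferred"
    if molecular_weight < 1_000:
        return "more preferred"
    if molecular_weight < 5_000:
        return "preferred"
    if molecular_weight < 10_000:
        return "small molecule"
    return "outside small-molecule range"


if __name__ == "__main__":
    # Hypothetical library entries: (compound id, molecular weight)
    for name, mw in [("cmpd-A", 350.4), ("cmpd-B", 742.9), ("cmpd-C", 12_500.0)]:
        print(name, size_preference(mw))
```

Each cut-off is treated as a strict upper bound, consistent with the phrase “less than” used in the text.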

[0046] Small molecule modulators identified from screening assays, as described below, can be used as lead compounds from which candidate clinical compounds may be designed, optimized, and synthesized. Such clinical compounds may have utility in treating pathologies associated with the beta-catenin pathway. The activity of candidate small molecule modulating agents may be improved several-fold through iterative secondary functional validation, as further described below, structure determination, and candidate modulator modification and testing. Additionally, candidate clinical compounds are generated with specific regard to clinical and pharmacological properties. For example, the reagents may be derivatized and re-screened using in vitro and in vivo assays to optimize activity and minimize toxicity for pharmaceutical development.

[0047] Protein Modulators

[0048] Specific MBCAT-interacting proteins are useful in a variety of diagnostic and therapeutic applications related to the beta-catenin pathway and related disorders, as well as in validation assays for other MBCAT-modulating agents. In a preferred embodiment, MBCAT-interacting proteins affect normal MBCAT function, including transcription, protein expression, protein localization, and cellular or extra-cellular activity. In another embodiment, MBCAT-interacting proteins are useful in detecting and providing information about the function of MBCAT proteins, as is relevant to beta-catenin related disorders, such as cancer (e.g., for diagnostic means).

[0049] An MBCAT-interacting protein may be endogenous, i.e. one that naturally interacts genetically or biochemically with an MBCAT, such as a member of the MBCAT pathway that modulates MBCAT expression, localization, and/or activity. MBCAT-modulators include dominant negative forms of MBCAT-interacting proteins and of MBCAT proteins themselves. Yeast two-hybrid and variant screens offer preferred methods for identifying endogenous MBCAT-interacting proteins (Finley, R. L. et al. (1996) in DNA Cloning-Expression Systems: A Practical Approach, eds. Glover D. & Hames B. D (Oxford University Press, Oxford, England), pp. 169-203; Fashena S J et al., Gene (2000) 250:1-14; Drees B L, Curr Opin Chem Biol (1999) 3:64-70; Vidal M and Legrain P, Nucleic Acids Res (1999) 27:919-29; and U.S. Pat. No. 5,928,868). Mass spectrometry is an alternative preferred method for the elucidation of protein complexes (reviewed in, e.g., Pandey A and Mann M, Nature (2000) 405:837-846; Yates J R 3rd, Trends Genet (2000) 16:5-8).

[0050] An MBCAT-interacting protein may be an exogenous protein, such as an MBCAT-specific antibody or a T-cell antigen receptor (see, e.g., Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory; Harlow and Lane (1999) Using antibodies: a laboratory manual. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press). MBCAT antibodies are further discussed below.

[0051] In preferred embodiments, an MBCAT-interacting protein specifically binds an MBCAT protein. In alternative preferred embodiments, an MBCAT-modulating agent binds an MBCAT substrate, binding partner, or cofactor.

[0052] Antibodies

[0053] In another embodiment, the protein modulator is an MBCAT-specific antibody agonist or antagonist. The antibodies have therapeutic and diagnostic utilities, and can be used in screening assays to identify MBCAT modulators. The antibodies can also be used in dissecting the portions of the MBCAT pathway responsible for various cellular responses and in the general processing and maturation of the MBCAT.

[0054] Antibodies that specifically bind MBCAT polypeptides can be generated using known methods. Preferably the antibody is specific to a mammalian ortholog of MBCAT polypeptide, and more preferably, to human MBCAT. Antibodies may be polyclonal, monoclonal (mAbs), humanized or chimeric antibodies, single chain antibodies, Fab fragments, F(ab′)2 fragments, fragments produced by a Fab expression library, anti-idiotypic (anti-Id) antibodies, and epitope-binding fragments of any of the above. Epitopes of MBCAT which are particularly antigenic can be selected, for example, by routine screening of MBCAT polypeptides for antigenicity or by applying a theoretical method for selecting antigenic regions of a protein (Hopp and Woods (1981), Proc. Natl. Acad. Sci. U.S.A. 78:3824-28; Hopp and Woods, (1983) Mol. Immunol. 20:483-89; Sutcliffe et al., (1983) Science 219:660-66) to the amino acid sequence of any of SEQ ID NOs: 14-18. Monoclonal antibodies with affinities of 10⁸ M⁻¹, preferably 10⁹ M⁻¹ to 10¹⁰ M⁻¹, or stronger can be made by standard procedures as described (Harlow and Lane, supra; Goding (1986) Monoclonal Antibodies: Principles and Practice (2d ed) Academic Press, New York; and U.S. Pat. Nos. 4,381,292; 4,451,570; and 4,618,577). Antibodies may be generated against crude cell extracts of MBCAT or substantially purified fragments thereof. If MBCAT fragments are used, they preferably comprise at least 10, and more preferably, at least 20 contiguous amino acids of an MBCAT protein. In a particular embodiment, MBCAT-specific antigens and/or immunogens are coupled to carrier proteins that stimulate the immune response. For example, the subject polypeptides are covalently coupled to the keyhole limpet hemocyanin (KLH) carrier, and the conjugate is emulsified in Freund's complete adjuvant, which enhances the immune response. An appropriate host, such as a laboratory rabbit or mouse, is immunized according to conventional protocols.

[0055] The presence of MBCAT-specific antibodies is assayed by an appropriate assay such as a solid phase enzyme-linked immunosorbent assay (ELISA) using immobilized corresponding MBCAT polypeptides. Other assays, such as radioimmunoassays or fluorescent assays, might also be used.

[0056] Chimeric antibodies specific to MBCAT polypeptides can be made that contain different portions from different animal species. For instance, a human immunoglobulin constant region may be linked to a variable region of a murine mAb, such that the antibody derives its biological activity from the human antibody, and its binding specificity from the murine fragment. Chimeric antibodies are produced by splicing together genes that encode the appropriate regions from each species (Morrison et al., Proc. Natl. Acad. Sci. (1984) 81:6851-6855; Neuberger et al., Nature (1984) 312:604-608; Takeda et al., Nature (1985) 314:452-454). Humanized antibodies, which are a form of chimeric antibodies, can be generated by grafting complementarity-determining regions (CDRs) (Carlos, T. M., J. M. Harlan. 1994. Blood 84:2068-2101) of mouse antibodies into a background of human framework regions and constant regions by recombinant DNA technology (Riechmann L M, et al., 1988 Nature 332:323-327). Humanized antibodies contain approximately 10% murine sequences and approximately 90% human sequences, and thus further reduce or eliminate immunogenicity, while retaining the antibody specificities (Co M S, and Queen C. 1991 Nature 351:501-502; Morrison S L. 1992 Ann. Rev. Immun. 10:239-265). Humanized antibodies and methods of their production are well-known in the art (U.S. Pat. Nos. 5,530,101, 5,585,089, 5,693,762, and 6,180,370).

[0057] MBCAT-specific single chain antibodies, which are recombinant, single chain polypeptides formed by linking the heavy and light chain fragments of the Fv regions via an amino acid bridge, can be produced by methods known in the art (U.S. Pat. No. 4,946,778; Bird, Science (1988) 242:423-426; Huston et al., Proc. Natl. Acad. Sci. USA (1988) 85:5879-5883; and Ward et al., Nature (1989) 334:544-546).

[0058] Other suitable techniques for antibody production involve in vitro exposure of lymphocytes to the antigenic polypeptides or, alternatively, selection of libraries of antibodies in phage or similar vectors (Huse et al., Science (1989) 246:1275-1281). As used herein, T-cell antigen receptors are included within the scope of antibody modulators (Harlow and Lane, 1988, supra).

[0059] The polypeptides and antibodies of the present invention may be used with or without modification. Frequently, antibodies will be labeled by joining, either covalently or non-covalently, a substance that provides for a detectable signal, or that is toxic to cells that express the targeted protein (Menard S, et al., Int J. Biol Markers (1989) 4:131-134). A wide variety of labels and conjugation techniques are known and are reported extensively in both the scientific and patent literature. Suitable labels include radionuclides, enzymes, substrates, cofactors, inhibitors, fluorescent moieties, fluorescent emitting lanthanide metals, chemiluminescent moieties, bioluminescent moieties, magnetic particles, and the like (U.S. Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241). Also, recombinant immunoglobulins may be produced (U.S. Pat. No. 4,816,567). Antibodies to cytoplasmic polypeptides may be delivered and reach their targets by conjugation with membrane-penetrating toxin proteins (U.S. Pat. No. 6,086,900).

[0060] When used therapeutically in a patient, the antibodies of the subject invention are typically administered parenterally, when possible at the target site, or intravenously. The therapeutically effective dose and dosage regimen is determined by clinical studies. Typically, the amount of antibody administered is in the range of about 0.1 mg/kg to about 10 mg/kg of patient weight. For parenteral administration, the antibodies are formulated in a unit dosage injectable form (e.g., solution, suspension, emulsion) in association with a pharmaceutically acceptable vehicle. Such vehicles are inherently nontoxic and non-therapeutic. Examples are water, saline, Ringer's solution, dextrose solution, and 5% human serum albumin. Nonaqueous vehicles such as fixed oils, ethyl oleate, or liposome carriers may also be used. The vehicle may contain minor amounts of additives, such as buffers and preservatives, which enhance isotonicity and chemical stability or otherwise enhance therapeutic potential. The antibodies' concentrations in such vehicles are typically in the range of about 1 mg/ml to about 10 mg/ml. Immunotherapeutic methods are further described in the literature (U.S. Pat. No. 5,859,206; WO0073469).
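
As a non-limiting worked illustration of the dose and concentration ranges recited above, the following sketch computes the total amount of antibody and the corresponding injection volume for a given patient weight. The function name `antibody_injection`, its parameters, and the example patient weight are hypothetical; the sketch performs arithmetic only and is not a dosing recommendation, the effective dose being determined by clinical studies as stated above.

```python
def antibody_injection(weight_kg: float, dose_mg_per_kg: float,
                       conc_mg_per_ml: float) -> tuple[float, float, bool]:
    """Return (total dose in mg, injection volume in ml, within typical ranges).

    The "typical" ranges are those recited in the text: about 0.1 to about
    10 mg/kg for the dose and about 1 to about 10 mg/ml for the formulation.
    The flag is informational only; values outside the ranges are not rejected.
    """
    if weight_kg <= 0 or dose_mg_per_kg <= 0 or conc_mg_per_ml <= 0:
        raise ValueError("weight, dose and concentration must be positive")
    total_mg = weight_kg * dose_mg_per_kg
    volume_ml = total_mg / conc_mg_per_ml
    typical = (0.1 <= dose_mg_per_kg <= 10.0) and (1.0 <= conc_mg_per_ml <= 10.0)
    return total_mg, volume_ml, typical


if __name__ == "__main__":
    # Hypothetical example: 70 kg patient, 1 mg/kg, formulated at 10 mg/ml
    print(antibody_injection(70.0, 1.0, 10.0))
```

For a 70 kg patient dosed at 1 mg/kg with a 10 mg/ml formulation, the arithmetic gives 70 mg of antibody in 7 ml.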

[0061] Nucleic Acid Modulators

[0062] Other preferred MBCAT-modulating agents comprise nucleic acid molecules, such as antisense oligomers or double stranded RNA (dsRNA), which generally inhibit MBCAT activity. Preferred nucleic acid modulators interfere with the function of the MBCAT nucleic acid such as DNA replication, transcription, translocation of the MBCAT RNA to the site of protein translation, translation of protein from the MBCAT RNA, splicing of the MBCAT RNA to yield one or more mRNA species, or catalytic activity which may be engaged in or facilitated by the MBCAT RNA.

[0063] In one embodiment, the antisense oligomer is an oligonucleotide that is sufficiently complementary to an MBCAT mRNA to bind to and prevent translation, preferably by binding to the 5′ untranslated region. MBCAT-specific antisense oligonucleotides preferably range from at least 6 to about 200 nucleotides. In some embodiments the oligonucleotide is preferably at least 10, 15, or 20 nucleotides in length. In other embodiments, the oligonucleotide is preferably less than 50, 40, or 30 nucleotides in length. The oligonucleotide can be DNA or RNA or a chimeric mixture or derivatives or modified versions thereof, single-stranded or double-stranded. The oligonucleotide can be modified at the base moiety, sugar moiety, or phosphate backbone. The oligonucleotide may include other appending groups such as peptides, agents that facilitate transport across the cell membrane, hybridization-triggered cleavage agents, and intercalating agents.
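
By way of illustration only, a fully complementary antisense DNA oligonucleotide may be derived from a target mRNA segment by taking its reverse complement and checking the length against the range recited above. The sketch below assumes perfect Watson-Crick complementarity and an unmodified DNA oligomer; the function name `antisense_dna` and the example target sequence are hypothetical and do not correspond to any MBCAT sequence.

```python
# Watson-Crick partner (as DNA) for each base that may appear in the target
_COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_dna(target: str, min_len: int = 6, max_len: int = 200) -> str:
    """Return the antisense DNA oligonucleotide (5'->3') for a target (5'->3').

    The default length limits follow the "at least 6 to about 200 nucleotides"
    range in the text; treating 200 as a hard upper limit is an assumption.
    """
    target = target.upper()
    if not min_len <= len(target) <= max_len:
        raise ValueError(f"target length must be {min_len}-{max_len} nucleotides")
    try:
        # Complement each base, reading the target from its 3' end
        return "".join(_COMPLEMENT[base] for base in reversed(target))
    except KeyError as exc:
        raise ValueError(f"unexpected base in target: {exc.args[0]!r}") from exc


if __name__ == "__main__":
    # Hypothetical 20-nt stretch of a 5' untranslated region
    print(antisense_dna("GCCACCAUGGCGUACGAUCC"))
```

Backbone, sugar, or base modifications of the kind described above would be applied to the resulting sequence and are outside the scope of this sketch.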

[0064] In another embodiment, the antisense oligomer is a phosphorodiamidate morpholino oligomer (PMO). PMOs are assembled from four different morpholino subunits, each of which contains one of four genetic bases (A, C, G, or T) linked to a six-membered morpholine ring. Polymers of these subunits are joined by non-ionic phosphodiamidate intersubunit linkages. Details of how to make and use PMOs and other antisense oligomers are well known in the art (e.g. see WO99/18193; Probst J C, Antisense Oligodeoxynucleotide and Ribozyme Design, Methods. (2000) 22(3):271-281; Summerton J, and Weller D. 1997 Antisense Nucleic Acid Drug Dev. 7:187-95; U.S. Pat. No. 5,235,033; and U.S. Pat. No. 5,378,841).

[0065] Alternative preferred MBCAT nucleic acid modulators are double-stranded RNA species mediating RNA interference (RNAi). RNAi is the process of sequence-specific, post-transcriptional gene silencing in animals and plants, initiated by double-stranded RNA (dsRNA) that is homologous in sequence to the silenced gene. Methods relating to the use of RNAi to silence genes in C. elegans, Drosophila, plants, and humans are known in the art (Fire A, et al., 1998 Nature 391:806-811; Fire, A. Trends Genet. 15, 358-363 (1999); Sharp, P. A. RNA interference 2001. Genes Dev. 15, 485-490 (2001); Hammond, S. M., et al., Nature Rev. Genet. 2, 110-119 (2001); Tuschl, T. Chem. Biochem. 2, 239-245 (2001); Hamilton, A. et al., Science 286, 950-952 (1999); Hammond, S. M., et al., Nature 404, 293-296 (2000); Zamore, P. D., et al., Cell 101, 25-33 (2000); Bernstein, E., et al., Nature 409, 363-366 (2001); Elbashir, S. M., et al., Genes Dev. 15, 188-200 (2001); WO0129058; WO9932619; Elbashir S M, et al., 2001 Nature 411:494-498).

[0066] Nucleic acid modulators are commonly used as research reagents, diagnostics, and therapeutics. For example, antisense oligonucleotides, which are able to inhibit gene expression with exquisite specificity, are often used to elucidate the function of particular genes (see, for example, U.S. Pat. No. 6,165,790). Nucleic acid modulators are also used, for example, to distinguish between functions of various members of a biological pathway. In addition, antisense oligomers have been employed as therapeutic moieties in the treatment of disease states in animals and man and have been demonstrated in numerous clinical trials to be safe and effective (Milligan J F, et al., Current Concepts in Antisense Drug Design, J Med Chem. (1993) 36:1923-1937; Tonkinson J L et al., Antisense Oligodeoxynucleotides as Clinical Therapeutic Agents, Cancer Invest. (1996) 14:54-65). Accordingly, in one aspect of the invention, an MBCAT-specific nucleic acid modulator is used in an assay to further elucidate the role of the MBCAT in the beta-catenin pathway, and/or its relationship to other members of the pathway. In another aspect of the invention, an MBCAT-specific antisense oligomer is used as a therapeutic agent for treatment of beta-catenin-related disease states.

[0067] Assay Systems

[0068] The invention provides assay systems and screening methods for identifying specific modulators of MBCAT activity. As used herein, an “assay system” encompasses all the components required for performing and analyzing results of an assay that detects and/or measures a particular event. In general, primary assays are used to identify or confirm a modulator's specific biochemical or molecular effect with respect to the MBCAT nucleic acid or protein. In general, secondary assays further assess the activity of an MBCAT-modulating agent identified by a primary assay and may confirm that the modulating agent affects MBCAT in a manner relevant to the beta-catenin pathway. In some cases, MBCAT modulators will be directly tested in a secondary assay.

[0069] In a preferred embodiment, the screening method comprises contacting a suitable assay system comprising an MBCAT polypeptide or nucleic acid with a candidate agent under conditions whereby, but for the presence of the agent, the system provides a reference activity (e.g. kinase activity), which is based on the particular molecular event the screening method detects. A statistically significant difference between the agent-biased activity and the reference activity indicates that the candidate agent modulates MBCAT activity, and hence the beta-catenin pathway. The MBCAT polypeptide or nucleic acid used in the assay may comprise any of the nucleic acids or polypeptides described above.
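
The text does not specify how statistical significance is to be assessed. Purely as one possible illustration, the sketch below compares replicate measurements of the reference activity and the agent-biased activity using an exact two-sided permutation test on the difference of means. The choice of test, the function name `permutation_p_value`, and the example readings are assumptions made for illustration and are not part of the disclosure.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(reference: list[float], treated: list[float]) -> float:
    """Exact two-sided permutation p-value for a difference in mean activity.

    Every way of splitting the pooled measurements into groups of the original
    sizes is enumerated, so this is only practical for small replicate numbers.
    """
    pooled = list(reference) + list(treated)
    n_ref, n_all = len(reference), len(pooled)
    observed = abs(mean(treated) - mean(reference))
    total = sum(pooled)
    extreme = splits = 0
    for idx in combinations(range(n_all), n_ref):
        ref_sum = sum(pooled[i] for i in idx)
        diff = (total - ref_sum) / (n_all - n_ref) - ref_sum / n_ref
        # Small tolerance so the observed split itself always counts as extreme
        if abs(diff) >= observed - 1e-12:
            extreme += 1
        splits += 1
    return extreme / splits


if __name__ == "__main__":
    # Hypothetical kinase-activity readings (arbitrary units), four replicates each
    reference_activity = [100, 98, 102, 101]
    agent_biased_activity = [60, 62, 58, 61]
    p = permutation_p_value(reference_activity, agent_biased_activity)
    print(f"p = {p:.4f}")
```

With four replicates per group there are 70 possible splits, so the smallest attainable two-sided p-value is 2/70; more replicates would be needed to resolve smaller values.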

[0070] Primary Assays

[0071] The type of modulator tested generally determines the type of primary assay.

[0072] Primary Assays for Small Molecule Modulators

[0073] For small molecule modulators, screening assays are used to identify candidate modulators. Screening assays may be cell-based or may use a cell-free system that recreates or retains the relevant biochemical reaction of the target protein (reviewed in Sittampalam G S et al., Curr Opin Chem Biol (1997) 1:384-91 and accompanying references). As used herein the term “cell-based” refers to assays using live cells, dead cells, or a particular cellular fraction, such as a membrane, endoplasmic reticulum, or mitochondrial fraction. The term “cell free” encompasses assays using substantially purified protein (either endogenous or recombinantly produced), partially purified or crude cellular extracts. Screening assays may detect a variety of molecular events, including protein-DNA interactions, protein-protein interactions (e.g., receptor-ligand binding), transcriptional activity (e.g., using a reporter gene), enzymatic activity (e.g., via a property of the substrate), activity of second messengers, immunogenicity and changes in cellular morphology or other cellular characteristics. Appropriate screening assays may use a wide range of detection methods including fluorescent, radioactive, colorimetric, spectrophotometric, and amperometric methods, to provide a read-out for the particular molecular event detected.

[0074] Cell-based screening assays usually require systems for recombinant expression of MBCAT and any auxiliary proteins demanded by the particular assay. Appropriate methods for generating recombinant proteins produce sufficient quantities of proteins that retain their relevant biological activities and are of sufficient purity to optimize activity and assure assay reproducibility. Yeast two-hybrid and variant screens, and mass spectrometry provide preferred methods for determining protein-protein interactions and elucidation of protein complexes. In certain applications, when MBCAT-interacting proteins are used in screens to identify small molecule modulators, the binding specificity of the interacting protein to the MBCAT protein may be assayed by various known methods such as substrate processing (e.g. ability of the candidate MBCAT-specific binding agents to function as negative effectors in MBCAT-expressing cells), binding equilibrium constants (usually at least about 10⁷ M⁻¹, preferably at least about 10⁸ M⁻¹, more preferably at least about 10⁹ M⁻¹), and immunogenicity (e.g. ability to elicit MBCAT specific antibody in a heterologous host such as a mouse, rat, goat or rabbit). For enzymes and receptors, binding may be assayed by, respectively, substrate and ligand processing.
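
The binding equilibrium constants recited above are association constants (units of M⁻¹). For readers more accustomed to dissociation constants, the following sketch converts between the two using the standard relation Kd = 1/Ka and assigns the tiers given in the text. The function names and tier labels are hypothetical and are provided for illustration only.

```python
def ka_to_kd_nanomolar(ka_per_molar: float) -> float:
    """Convert an association constant Ka (M^-1) to a dissociation constant Kd (nM)."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    # Kd = 1 / Ka in molar; multiplying by 1e9 expresses the result in nanomolar
    return 1e9 / ka_per_molar


def affinity_tier(ka_per_molar: float) -> str:
    """Classify Ka against the 10^7, 10^8 and 10^9 M^-1 thresholds from the text."""
    if ka_per_molar >= 1e9:
        return "more preferred"
    if ka_per_molar >= 1e8:
        return "preferred"
    if ka_per_molar >= 1e7:
        return "usual minimum"
    return "below usual minimum"


if __name__ == "__main__":
    for ka in (1e7, 1e8, 1e9):
        print(f"Ka = {ka:.0e} M^-1 -> Kd = {ka_to_kd_nanomolar(ka):g} nM, "
              f"{affinity_tier(ka)}")
```

Under this relation, the three thresholds of 10⁷, 10⁸ and 10⁹ M⁻¹ correspond to dissociation constants of 100 nM, 10 nM and 1 nM, respectively.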

[0075] The screening assay may measure a candidate agent's ability to specifically bind to or modulate activity of an MBCAT polypeptide, a fusion protein thereof, or to cells or membranes bearing the polypeptide or fusion protein. The MBCAT polypeptide can be full length or a fragment thereof that retains functional MBCAT activity. The MBCAT polypeptide may be fused to another polypeptide, such as a peptide tag for detection or anchoring, or to another tag. The MBCAT polypeptide is preferably human MBCAT, or is an ortholog or derivative thereof as described above. In a preferred embodiment, the screening assay detects candidate agent-based modulation of MBCAT interaction with a binding target, such as an endogenous or exogenous protein or other substrate that has MBCAT-specific binding activity, and can be used to assess normal MBCAT gene function.

[0076] Suitable assay formats that may be adapted to screen for MBCAT modulators are known in the art. Preferred screening assays are high throughput or ultra high throughput and thus provide automated, cost-effective means of screening compound libraries for lead compounds (Fernandes P B, Curr Opin Chem Biol (1998) 2:597-603; Sundberg S A, Curr Opin Biotechnol 2000, 11:47-53). In one preferred embodiment, screening assays use fluorescence technologies, including fluorescence polarization, time-resolved fluorescence, and fluorescence resonance energy transfer. These systems offer means to monitor protein-protein or DNA-protein interactions in which the intensity of the signal emitted from dye-labeled molecules depends upon their interactions with partner molecules (e.g., Selvin P R, Nat Struct Biol (2000) 7:730-4; Fernandes P B, supra; Hertzberg R P and Pope A J, Curr Opin Chem Biol (2000) 4:445-451).

[0077] A variety of suitable assay systems may be used to identify candidate MBCAT and beta-catenin pathway modulators (e.g. U.S. Pat. No. 6,165,992 (kinase assays); U.S. Pat. Nos. 5,550,019 and 6,133,437 (apoptosis assays); and U.S. Pat. Nos. 5,976,782, 6,225,118 and 6,444,434 (angiogenesis assays), among others). Specific preferred assays are described in more detail below.

[0078] Kinase assays. In some preferred embodiments the screening assay detects the ability of the test agent to modulate the kinase activity of an MBCAT polypeptide. In further embodiments, a cell-free kinase assay system is used to identify a candidate beta-catenin modulating agent, and a secondary, cell-based assay, such as an apoptosis or hypoxic induction assay (described below), may be used to further characterize the candidate beta-catenin modulating agent. Many different assays for kinases have been reported in the literature and are well known to those skilled in the art (e.g. U.S. Pat. No. 6,165,992; Zhu et al., Nature Genetics (2000) 26:283-289; and WO0073469). Radioassays, which monitor the transfer of a gamma phosphate, are frequently used. For instance, a scintillation assay for p56 (lck) kinase activity monitors the transfer of the gamma phosphate from gamma-³³P ATP to a biotinylated peptide substrate; the substrate is captured on a streptavidin coated bead that transmits the signal (Beveridge M et al., J Biomol Screen (2000) 5:205-212). This assay uses the scintillation proximity assay (SPA), in which only radio-ligand bound to receptors tethered to the surface of an SPA bead is detected by the scintillant immobilized within it, allowing binding to be measured without separation of bound from free ligand.

[0079] Other assays for protein kinase activity may use antibodies that specifically recognize phosphorylated substrates. For instance, the kinase receptor activation (KIRA) assay measures receptor tyrosine kinase activity by ligand stimulating the intact receptor in cultured cells, then capturing solubilized receptor with specific antibodies and quantifying phosphorylation via phosphotyrosine ELISA (Sadick M D, Dev Biol Stand (1999) 97:121-133).

[0080] Another example of antibody-based assays for protein kinase activity is TRF (time-resolved fluorometry). This method utilizes europium chelate-labeled anti-phosphotyrosine antibodies to detect phosphate transfer to a polymeric substrate coated onto microtiter plate wells. The amount of phosphorylation is then detected using time-resolved, dissociation-enhanced fluorescence (Braunwalder A F, et al., Anal Biochem (1996) 238(2):159-64).

[0081] Apoptosis assays. Assays for apoptosis may be performed by terminal deoxynucleotidyl transferase-mediated digoxigenin-11-dUTP nick end labeling (TUNEL) assay. The TUNEL assay is used to measure nuclear DNA fragmentation characteristic of apoptosis (Lazebnik et al., 1994, Nature 371, 346), by following the incorporation of fluorescein-dUTP (Yonehara et al., 1989, J. Exp. Med. 169, 1747). Apoptosis may further be assayed by acridine orange staining of tissue culture cells (Lucas, R., et al., 1998, Blood 15:4730-41). An apoptosis assay system may comprise a cell that expresses an MBCAT, and that optionally has defective beta-catenin function (e.g. beta-catenin is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the apoptosis assay system, and changes in induction of apoptosis relative to controls where no test agent is added identify candidate beta-catenin modulating agents. In some embodiments of the invention, an apoptosis assay may be used as a secondary assay to test a candidate beta-catenin modulating agent that is initially identified using a cell-free assay system. An apoptosis assay may also be used to test whether MBCAT function plays a direct role in apoptosis. For example, an apoptosis assay may be performed on cells that over- or under-express MBCAT relative to wild type cells. Differences in apoptotic response compared to wild type cells suggest that the MBCAT plays a direct role in the apoptotic response. Apoptosis assays are described further in U.S. Pat. No. 6,133,437.

[0082] Cell proliferation and cell cycle assays. Cell proliferation may be assayed via bromodeoxyuridine (BRDU) incorporation. This assay identifies a cell population undergoing DNA synthesis by incorporation of BRDU into newly-synthesized DNA. Newly-synthesized DNA may then be detected using an anti-BRDU antibody (Hoshino et al., 1986, Int. J. Cancer 38, 369; Campana et al., 1988, J. Immunol. Meth. 107, 79), or by other means.

[0083] Cell proliferation is also assayed via phospho-histone H3 staining, which identifies a cell population undergoing mitosis by phosphorylation of histone H3. Phosphorylation of histone H3 at serine 10 is detected using an antibody specific to the phosphorylated form of the serine 10 residue of histone H3 (Chadee, D. N. 1995, J. Biol. Chem 270:20098-105). Cell proliferation may also be examined using [³H]-thymidine incorporation (Chen, J., 1996, Oncogene 13:1395-403; Jeoung, J., 1995, J. Biol. Chem. 270:18367-73). This assay allows for quantitative characterization of S-phase DNA synthesis. In this assay, cells synthesizing DNA will incorporate [³H]-thymidine into newly synthesized DNA. Incorporation can then be measured by standard techniques such as by counting of radioisotope in a scintillation counter (e.g., Beckman LS 3800 Liquid Scintillation Counter). Another proliferation assay uses the dye Alamar Blue (available from Biosource International), which fluoresces when reduced in living cells and provides an indirect measurement of cell number (Voytik-Harbin S L et al., 1998, In Vitro Cell Dev Biol Anim 34:239-46).

[0084] Cell proliferation may also be assayed by colony formation in soft agar (Sambrook et al., Molecular Cloning, Cold Spring Harbor (1989)). For example, cells transformed with MBCAT are seeded in soft agar plates, and colonies are measured and counted after two weeks of incubation.

[0085] Involvement of a gene in the cell cycle may be assayed by flow cytometry (Gray J W et al. (1986) Int J Radiat Biol Relat Stud Phys Chem Med 49:237-55). Cells transfected with an MBCAT may be stained with propidium iodide and evaluated in a flow cytometer (available from Becton Dickinson), which indicates accumulation of cells in different stages of the cell cycle.

[0086] Accordingly, a cell proliferation or cell cycle assay system may comprise a cell that expresses an MBCAT, and that optionally has defective beta-catenin function (e.g. beta-catenin is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the assay system, and changes in cell proliferation or cell cycle relative to controls where no test agent is added identify candidate beta-catenin modulating agents. In some embodiments of the invention, the cell proliferation or cell cycle assay may be used as a secondary assay to test a candidate beta-catenin modulating agent that is initially identified using another assay system such as a cell-free assay system. A cell proliferation assay may also be used to test whether MBCAT function plays a direct role in cell proliferation or cell cycle. For example, a cell proliferation or cell cycle assay may be performed on cells that over- or under-express MBCAT relative to wild type cells. Differences in proliferation or cell cycle compared to wild type cells suggest that the MBCAT plays a direct role in cell proliferation or cell cycle.

[0087] Angiogenesis. Angiogenesis may be assayed using various human endothelial cell systems, such as umbilical vein, coronary artery, or dermal cells. Suitable assays include Alamar Blue based assays (available from Biosource International) to measure proliferation; migration assays using fluorescent molecules, such as the use of Becton Dickinson Falcon HTS FluoroBlok cell culture inserts to measure migration of cells through membranes in the presence or absence of angiogenesis enhancers or suppressors; and tubule formation assays based on the formation of tubular structures by endothelial cells on Matrigel® (Becton Dickinson). Accordingly, an angiogenesis assay system may comprise a cell that expresses an MBCAT, and that optionally has defective beta-catenin function (e.g. beta-catenin is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the angiogenesis assay system, and changes in angiogenesis relative to controls where no test agent is added identify candidate beta-catenin modulating agents. In some embodiments of the invention, the angiogenesis assay may be used as a secondary assay to test a candidate beta-catenin modulating agent that is initially identified using another assay system. An angiogenesis assay may also be used to test whether MBCAT function plays a direct role in angiogenesis. For example, an angiogenesis assay may be performed on cells that over- or under-express MBCAT relative to wild type cells. Differences in angiogenesis compared to wild type cells suggest that the MBCAT plays a direct role in angiogenesis. U.S. Pat. Nos. 5,976,782, 6,225,118 and 6,444,434, among others, describe various angiogenesis assays.

[0088] Hypoxic induction. The alpha subunit of the transcription factor hypoxia inducible factor-1 (HIF-1) is upregulated in tumor cells following exposure to hypoxia in vitro. Under hypoxic conditions, HIF-1 stimulates the expression of genes known to be important in tumor cell survival, such as those encoding glycolytic enzymes and VEGF. Induction of such genes by hypoxic conditions may be assayed by growing cells transfected with MBCAT in hypoxic conditions (such as with 0.1% O2, 5% CO2, and balance N2, generated in a Napco 7001 incubator (Precision Scientific)) and normoxic conditions, followed by assessment of gene activity or expression by Taqman®. For example, a hypoxic induction assay system may comprise a cell that expresses an MBCAT, and that optionally has defective beta-catenin function (e.g. beta-catenin is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the hypoxic induction assay system, and changes in hypoxic response relative to controls where no test agent is added identify candidate beta-catenin modulating agents. In some embodiments of the invention, the hypoxic induction assay may be used as a secondary assay to test a candidate beta-catenin modulating agent that is initially identified using another assay system. A hypoxic induction assay may also be used to test whether MBCAT function plays a direct role in the hypoxic response. For example, a hypoxic induction assay may be performed on cells that over- or under-express MBCAT relative to wild type cells. Differences in hypoxic response compared to wild type cells suggest that the MBCAT plays a direct role in hypoxic induction.

[0089] Cell adhesion. Cell adhesion assays measure adhesion of cells to purified adhesion proteins, or adhesion of cells to each other, in the presence or absence of candidate modulating agents. Cell-protein adhesion assays measure the ability of agents to modulate the adhesion of cells to purified proteins. For example, recombinant proteins are produced, diluted to 2.5 μg/mL in PBS, and used to coat the wells of a microtiter plate. The wells used for negative control are not coated. Coated wells are then washed, blocked with 1% BSA, and washed again. Compounds are diluted to 2× final test concentration and added to the blocked, coated wells. Cells are then added to the wells, and the unbound cells are washed off. Retained cells are labeled directly on the plate by adding a membrane-permeable fluorescent dye, such as calcein-AM, and the signal is quantified in a fluorescent microplate reader.

[0090] Cell-cell adhesion assays measure the ability of agents to modulate binding of cell adhesion proteins with their native ligands. These assays use cells that naturally or recombinantly express the adhesion protein of choice. In an exemplary assay, cells expressing the cell adhesion protein are plated in wells of a multiwell plate. Cells expressing the ligand are labeled with a membrane-permeable fluorescent dye, such as BCECF, and allowed to adhere to the monolayers in the presence of candidate agents. Unbound cells are washed off, and bound cells are detected using a fluorescence plate reader.

[0091] High-throughput cell adhesion assays have also been described. In one such assay, small molecule ligands and peptides are bound to the surface of microscope slides using a microarray spotter, intact cells are then contacted with the slides, and unbound cells are washed off. In this assay, not only is the binding specificity of the peptides and modulators against cell lines determined, but the functional cell signaling of attached cells is also measured using immunofluorescence techniques in situ on the microchip (Falsey J R et al., Bioconjug Chem. 2001 May-June;12(3):346-53).

[0092] Tubulogenesis. Tubulogenesis assays monitor the ability of cultured cells, generally endothelial cells, to form tubular structures on a matrix substrate, which generally simulates the environment of the extracellular matrix. Exemplary substrates include Matrigel™ (Becton Dickinson), an extract of basement membrane proteins containing laminin, collagen IV, and heparan sulfate proteoglycan, which is liquid at 4° C. and forms a solid gel at 37° C. Other suitable matrices comprise extracellular components such as collagen, fibronectin, and/or fibrin. Cells are stimulated with a pro-angiogenic stimulant, and their ability to form tubules is detected by imaging. Tubules can generally be detected after an overnight incubation with stimuli, but longer or shorter time frames may also be used. Tube formation assays are well known in the art (e.g., Jones M K et al., 1999, Nature Medicine 5:1418-1423). These assays have traditionally involved stimulation with serum or with the growth factors FGF or VEGF. Serum represents an undefined source of growth factors. In a preferred embodiment, the assay is performed with cells cultured in serum free medium, in order to control which process or pathway a candidate agent modulates. Moreover, we have found that different target genes respond differently to stimulation with different pro-angiogenic agents, including inflammatory angiogenic factors such as TNF-alpha. Thus, in a further preferred embodiment, a tubulogenesis assay system comprises testing an MBCAT's response to a variety of factors, such as FGF, VEGF, phorbol myristate acetate (PMA), TNF-alpha, ephrin, etc.

[0093] Cell Migration. An invasion/migration assay (also called a migration assay) tests the ability of cells to overcome a physical barrier and to migrate towards pro-angiogenic signals. Migration assays are known in the art (e.g., Paik J H et al., 2001, J Biol Chem 276:11830-11837). In a typical experimental set-up, cultured endothelial cells are seeded onto a matrix-coated porous lamina, with pore sizes generally smaller than typical cell size. The matrix generally simulates the environment of the extracellular matrix, as described above. The lamina is typically a membrane, such as the transwell polycarbonate membrane (Corning Costar Corporation, Cambridge, Mass.), and is generally part of an upper chamber that is in fluid contact with a lower chamber containing pro-angiogenic stimuli. Migration is generally assayed after an overnight incubation with stimuli, but longer or shorter time frames may also be used. Migration is assessed as the number of cells that crossed the lamina, and may be detected by staining cells with hematoxylin solution (VWR Scientific, South San Francisco, Calif.), or by any other method for determining cell number. In another exemplary set-up, cells are fluorescently labeled and migration is detected using fluorescent readings, for instance using the Falcon HTS FluoroBlok (Becton Dickinson). While some migration is observed in the absence of stimulus, migration is greatly increased in response to pro-angiogenic factors. As described above, a preferred assay system for migration/invasion assays comprises testing an MBCAT's response to a variety of pro-angiogenic factors, including tumor angiogenic and inflammatory angiogenic agents, and culturing the cells in serum free medium.

[0094] Sprouting assay. A sprouting assay is a three-dimensional in vitro angiogenesis assay that uses a cell-number defined spheroid aggregation of endothelial cells (“spheroid”), embedded in a collagen gel-based matrix. The spheroid can serve as a starting point for the sprouting of capillary-like structures by invasion into the extracellular matrix (termed “cell sprouting”) and the subsequent formation of complex anastomosing networks (Korff and Augustin, 1999, J Cell Sci 112:3249-58). In an exemplary experimental set-up, spheroids are prepared by pipetting 400 human umbilical vein endothelial cells into individual wells of a nonadhesive 96-well plate to allow overnight spheroidal aggregation (Korff and Augustin, 1998, J Cell Biol 143:1341-52). Spheroids are harvested and seeded in 900 μl of methocel-collagen solution and pipetted into individual wells of a 24 well plate to allow collagen gel polymerization. Test agents are added after 30 min by pipetting 100 μl of 10-fold concentrated working dilution of the test substances on top of the gel. Plates are incubated at 37° C. for 24 h. Dishes are fixed at the end of the experimental incubation period by addition of paraformaldehyde. Sprouting intensity of endothelial cells can be quantitated by an automated image analysis system to determine the cumulative sprout length per spheroid.
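
The dosing arithmetic and the read-out of the sprouting assay may be illustrated with a minimal sketch. The function names are hypothetical and not part of the described method; the sketch assumes that the test substance distributes evenly through the gel and that an image analysis system has already produced a list of individual sprout lengths for one spheroid.

```python
def final_fold_concentration(stock_fold, added_volume_ul, gel_volume_ul):
    """Fold concentration after the added volume mixes with the gel volume.

    Assumes even distribution of the test substance through the gel.
    """
    total_volume_ul = added_volume_ul + gel_volume_ul
    return stock_fold * added_volume_ul / total_volume_ul


def cumulative_sprout_length(sprout_lengths_um):
    """Sum of the individual sprout lengths measured for one spheroid."""
    return sum(sprout_lengths_um)


# 100 ul of a 10-fold concentrated working dilution on a 900 ul gel
fold = final_fold_concentration(10.0, 100.0, 900.0)
# Hypothetical sprout lengths (in micrometers) for a single spheroid
total_length = cumulative_sprout_length([120.0, 80.5, 40.0])
```

Under these assumptions, 100 μl of a 10-fold concentrated dilution added to a 900 μl gel yields the 1-fold working concentration.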

[0095] Primary Assays for Antibody Modulators

[0096] For antibody modulators, an appropriate primary assay is a binding assay that tests the antibody's affinity to and specificity for the MBCAT protein. Methods for testing antibody affinity and specificity are well known in the art (Harlow and Lane, 1988, 1999, supra). The enzyme-linked immunosorbent assay (ELISA) is a preferred method for detecting MBCAT-specific antibodies; others include FACS assays, radioimmunoassays, and fluorescent assays.

[0097] In some cases, screening assays described for small molecule modulators may also be used to test antibody modulators.

[0098] Primary Assays for Nucleic Acid Modulators

[0099] For nucleic acid modulators, primary assays may test the ability of the nucleic acid modulator to inhibit or enhance MBCAT gene expression, preferably mRNA expression. In general, expression analysis comprises comparing MBCAT expression in like populations of cells (e.g., two pools of cells that endogenously or recombinantly express MBCAT) in the presence and absence of the nucleic acid modulator. Methods for analyzing mRNA and protein expression are well known in the art. For instance, Northern blotting, slot blotting, ribonuclease protection, quantitative RT-PCR (e.g., using the TaqMan®, PE Applied Biosystems), or microarray analysis may be used to confirm that MBCAT mRNA expression is reduced in cells treated with the nucleic acid modulator (e.g., Current Protocols in Molecular Biology (1994) Ausubel F M et al., eds., John Wiley & Sons, Inc., chapter 4; Freeman W M et al., Biotechniques (1999) 26:112-125; Kallioniemi O P, Ann Med 2001, 33:142-147; Blohm D H and Guiseppi-Elie A, Curr Opin Biotechnol 2001, 12:41-47). Protein expression may also be monitored. Proteins are most commonly detected with specific antibodies or antisera directed against either the MBCAT protein or specific peptides. A variety of means, including Western blotting, ELISA, or in situ detection, are available (Harlow E and Lane D, 1988 and 1999, supra).

[0100] In some cases, screening assays described for small molecule modulators, particularly in assay systems that involve MBCAT mRNA expression, may also be used to test nucleic acid modulators.

[0101] Secondary Assays

[0102] Secondary assays may be used to further assess the activity of an MBCAT-modulating agent identified by any of the above methods to confirm that the modulating agent affects MBCAT in a manner relevant to the beta-catenin pathway. As used herein, MBCAT-modulating agents encompass candidate clinical compounds or other agents derived from previously identified modulating agents. Secondary assays can also be used to test the activity of a modulating agent on a particular genetic or biochemical pathway or to test the specificity of the modulating agent's interaction with MBCAT.

[0103] Secondary assays generally compare like populations of cells or animals (e.g., two pools of cells or animals that endogenously or recombinantly express MBCAT) in the presence and absence of the candidate modulator. In general, such assays test whether treatment of cells or animals with a candidate MBCAT-modulating agent results in changes in the beta-catenin pathway in comparison to untreated (or mock- or placebo-treated) cells or animals. Certain assays use “sensitized genetic backgrounds”, which, as used herein, describe cells or animals engineered for altered expression of genes in the beta-catenin or interacting pathways.

[0104] Cell-Based Assays

[0105] Cell based assays may use a variety of mammalian cell lines known to have defective beta-catenin function. Cell based assays may detect endogenous beta-catenin pathway activity or may rely on recombinant expression of beta-catenin pathway components. Any of the aforementioned assays may be used in this cell-based format. Candidate modulators are typically added to the cell media but may also be injected into cells or delivered by any other efficacious means.

[0106] Animal Assays

[0107] A variety of non-human animal models of normal or defective beta-catenin pathway may be used to test candidate MBCAT modulators. Models for defective beta-catenin pathway typically use genetically modified animals that have been engineered to mis-express (e.g., over-express or lack expression in) genes involved in the beta-catenin pathway. Assays generally require systemic delivery of the candidate modulators, such as by oral administration, injection, etc.

[0108] In a preferred embodiment, beta-catenin pathway activity is assessed by monitoring neovascularization and angiogenesis. Animal models with defective and normal beta-catenin are used to test the candidate modulator's effect on MBCAT in Matrigel® assays. Matrigel® is an extract of basement membrane proteins, and is composed primarily of laminin, collagen IV, and heparan sulfate proteoglycan. It is provided as a sterile liquid at 4° C., but rapidly forms a solid gel at 37° C. Liquid Matrigel® is mixed with various angiogenic agents, such as bFGF and VEGF, or with human tumor cells which over-express the MBCAT. The mixture is then injected subcutaneously (SC) into female athymic nude mice (Taconic, Germantown, N.Y.) to support an intense vascular response. Mice with Matrigel® pellets may be dosed via oral (PO), intraperitoneal (IP), or intravenous (IV) routes with the candidate modulator. Mice are euthanized 5-12 days post-injection, and the Matrigel® pellet is harvested for hemoglobin analysis (Sigma plasma hemoglobin kit). Hemoglobin content of the gel is found to correlate with the degree of neovascularization in the gel.

[0109] In another preferred embodiment, the effect of the candidate modulator on MBCAT is assessed via tumorigenicity assays. Tumor xenograft assays are known in the art (see, e.g., Ogawa K et al., 2000, Oncogene 19:6043-6052). Xenografts are typically implanted SC into female athymic mice, 6-7 weeks old, as single cell suspensions either from a pre-existing tumor or from in vitro culture. The tumors which express the MBCAT endogenously are injected in the flank, 1×10⁵ to 1×10⁷ cells per mouse in a volume of 100 μL using a 27 gauge needle. Mice are then ear tagged and tumors are measured twice weekly. Candidate modulator treatment is initiated on the day the mean tumor weight reaches 100 mg. Candidate modulator is delivered IV, SC, IP, or PO by bolus administration. Depending upon the pharmacokinetics of each unique candidate modulator, dosing can be performed multiple times per day. The tumor weight is assessed by measuring perpendicular diameters with a caliper and calculated by multiplying the measurements of diameters in two dimensions. At the end of the experiment, the excised tumors may be utilized for biomarker identification or further analyses. For immunohistochemistry staining, xenograft tumors are fixed in 4% paraformaldehyde, 0.1M phosphate, pH 7.2, for 6 hours at 4° C., immersed in 30% sucrose in PBS, and rapidly frozen in isopentane cooled with liquid nitrogen.

[0110] In another preferred embodiment, tumorigenicity is monitored using a hollow fiber assay, which is described in U.S. Pat. No. 5,698,413. Briefly, the method comprises implanting into a laboratory animal a biocompatible, semi-permeable encapsulation device containing target cells, treating the laboratory animal with a candidate modulating agent, and evaluating the target cells for reaction to the candidate modulator. Implanted cells are generally human cells from a pre-existing tumor or a tumor cell line. After an appropriate period of time, generally around six days, the implanted samples are harvested for evaluation of the candidate modulator. Tumorigenicity and modulator efficacy may be evaluated by assaying the quantity of viable cells present in the macrocapsule, which can be determined by tests known in the art, for example, MTT dye conversion assay, neutral red dye uptake, trypan blue staining, viable cell counts, the number of colonies formed in soft agar, the capacity of the cells to recover and replicate in vitro, etc.

[0111] In another preferred embodiment, a tumorigenicity assay uses a transgenic animal, usually a mouse, carrying a dominant oncogene or tumor suppressor gene knockout under the control of tissue specific regulatory sequences; these assays are generally referred to as transgenic tumor assays. In a preferred application, tumor development in the transgenic model is well characterized or is controlled. In an exemplary model, the “RIP1-Tag2” transgene, comprising the SV40 large T-antigen oncogene under control of the insulin gene regulatory regions, is expressed in pancreatic beta cells and results in islet cell carcinomas (Hanahan D, 1985, Nature 315:115-122; Parangi S et al, 1996, Proc Natl Acad Sci USA 93: 2002-2007; Bergers G et al, 1999, Science 284:808-812). An “angiogenic switch” occurs at approximately five weeks, as normally quiescent capillaries in a subset of hyperproliferative islets become angiogenic. The RIP1-Tag2 mice die by age 14 weeks. Candidate modulators may be administered at a variety of stages, including just prior to the angiogenic switch (e.g., for a model of tumor prevention), during the growth of small tumors (e.g., for a model of intervention), or during the growth of large and/or invasive tumors (e.g., for a model of regression). Tumorigenicity and modulator efficacy can be evaluated by assessing life-span extension and/or tumor characteristics, including number of tumors, tumor size, tumor morphology, vessel density, apoptotic index, etc.

[0112] Diagnostic and Therapeutic Uses

[0113] Specific MBCAT-modulating agents are useful in a variety of diagnostic and therapeutic applications where disease or disease prognosis is related to defects in the beta-catenin pathway, such as angiogenic, apoptotic, or cell proliferation disorders. Accordingly, the invention also provides methods for modulating the beta-catenin pathway in a cell, preferably a cell pre-determined to have defective or impaired beta-catenin function (e.g. due to overexpression, underexpression, or misexpression of beta-catenin, or due to gene mutations), comprising the step of administering an agent to the cell that specifically modulates MBCAT activity. Preferably, the modulating agent produces a detectable phenotypic change in the cell indicating that the beta-catenin function is restored. The phrase “function is restored”, and equivalents, as used herein, means that the desired phenotype is achieved, or is brought closer to normal compared to untreated cells. For example, with restored beta-catenin function, cell proliferation and/or progression through cell cycle may normalize, or be brought closer to normal relative to untreated cells. The invention also provides methods for treating disorders or disease associated with impaired beta-catenin function by administering a therapeutically effective amount of an MBCAT-modulating agent that modulates the beta-catenin pathway. The invention further provides methods for modulating MBCAT function in a cell, preferably a cell pre-determined to have defective or impaired MBCAT function, by administering an MBCAT-modulating agent. Additionally, the invention provides a method for treating disorders or disease associated with impaired MBCAT function by administering a therapeutically effective amount of an MBCAT-modulating agent.

[0114] The discovery that MBCAT is implicated in the beta-catenin pathway provides for a variety of methods that can be employed for the diagnostic and prognostic evaluation of diseases and disorders involving defects in the beta-catenin pathway and for the identification of subjects having a predisposition to such diseases and disorders.

[0115] Various expression analysis methods can be used to diagnose whether MBCAT expression occurs in a particular sample, including Northern blotting, slot blotting, ribonuclease protection, quantitative RT-PCR, and microarray analysis (e.g., Current Protocols in Molecular Biology (1994) Ausubel F M et al., eds., John Wiley & Sons, Inc., chapter 4; Freeman W M et al., Biotechniques (1999) 26:112-125; Kallioniemi O P, Ann Med 2001, 33:142-147; Blohm and Guiseppi-Elie, Curr Opin Biotechnol 2001, 12:41-47). Tissues having a disease or disorder implicating defective beta-catenin signaling that express an MBCAT are identified as amenable to treatment with an MBCAT modulating agent. In a preferred application, the beta-catenin defective tissue overexpresses an MBCAT relative to normal tissue. For example, a Northern blot analysis of mRNA from tumor and normal cell lines, or from tumor and matching normal tissue samples from the same patient, using full or partial MBCAT cDNA sequences as probes, can determine whether particular tumors express or overexpress MBCAT. Alternatively, the TaqMan® is used for quantitative RT-PCR analysis of MBCAT expression in cell lines, normal tissues and tumor samples (PE Applied Biosystems).

[0116] Various other diagnostic methods may be performed, for example, utilizing reagents such as the MBCAT oligonucleotides, and antibodies directed against an MBCAT, as described above for: (1) the detection of the presence of MBCAT gene mutations, or the detection of either over- or under-expression of MBCAT mRNA relative to the non-disorder state; (2) the detection of either an over- or an under-abundance of MBCAT gene product relative to the non-disorder state; and (3) the detection of perturbations or abnormalities in the signal transduction pathway mediated by MBCAT.

[0117] Thus, in a specific embodiment, the invention is drawn to a method for diagnosing a disease or disorder in a patient that is associated with alterations in MBCAT expression, the method comprising: a) obtaining a biological sample from the patient; b) contacting the sample with a probe for MBCAT expression; c) comparing results from step (b) with a control; and d) determining whether step (c) indicates a likelihood of the disease or disorder. Preferably, the disease is cancer, most preferably a cancer as shown in TABLE 1. The probe may be either DNA or protein, including an antibody.

EXAMPLES

[0118] The following experimental section and examples are offered by way of illustration and not by way of limitation.

[0119] I. C. elegans Beta-Catenin Screen

[0120] The identification of mutants that suppress the cell adhesion defect of beta-catenin may lead to unique therapeutic targets that inhibit cell migration or metastasis. hmp-2 was initially identified in an EMS screen for defects in body elongation during embryonic morphogenesis (see Costa et al., (1998) The Journal of Cell Biology 141: 297-308). The loss of function allele hmp-2 (zu364) exhibits 99% embryonic lethality, with mutant embryos arresting during elongation and abnormal bulges forming on the dorsal side. About 1% of these embryos hatch to form viable lumpy larvae. The reduction of function allele hmp-2 (qm39) yields viable larvae with a characteristic lumpy appearance. When grown at 15° C., approximately 92% (SD 3.9) of the L1 larvae show this lumpy phenotype, with the penetrance of the phenotype decreasing as the animals molt and move through successive larval stages. For this screen, hmp-2 (qm39) worms were soaked at 15° C. in double stranded RNA (dsRNA) at the L4 larval stage and the progeny were scored as L1 larvae for modification of the adhesion defect. The screen protocol is described below.

[0121] 1) hmp-2 (qm39) animals were bleached and hatched on peptone free agarose plates to produce a synchronous population. Starved L1s were transferred to 10× peptone plates seeded with 750 μl OP50 (25% w/v in TB) and allowed to develop to the L4 larval stage.

[0122] 2) dsRNA was dispensed in 6 μl aliquots into 96 well round bottom plates (Nunc #262162). L4 animals were collected by suspension in M9 buffer, washed 2× with M9 to remove any excess OP50, and dispensed in 2 μl aliquots into the RNA to a total worm density of 75-100 worms per well. As a control, multiple wells contained only RNA resuspension buffer (1× IM buffer).

[0123] 3) Animals were soaked in dsRNA at 15° C. for 24 hours.

[0124] 4) Following dsRNA soaking, the animals were fed in the wells by addition of 25 μl liquid NGM+3% OP50. The animals were kept at 15° C. and allowed to become gravid and lay progeny in the wells, which took approximately 72 hours. Food levels were monitored visually during maturation and more was added as needed.

[0125] 5) Following maturation, animals from each well were plated onto individual 6 cm peptone free agarose plates and placed at 15° C. overnight.

[0126] 6) Animals on each plate were scored visually under the dissecting microscope for modification of the lumpy phenotype. Scoring was performed qualitatively, with an increase in dead embryos scored as enhancement and an increase in wild type appearing animals scored as suppression of the defect.

[0127] 7) Retests of interesting suppressor candidates followed the same protocol as the primary screen with certain modifications: several retests were performed for each suppressor, retested candidates were encoded so that they could be scored blindly, and retested candidates were scored quantitatively. Each plate was scored by counting 100 total objects. An object was defined as either an embryo or an L1 stage larva. Each object was scored as one of the following: a wildtype appearing animal, a lumpy appearing animal, or an unhatched embryo. Scores were represented as the percentage of wildtype appearing animals relative to all objects scored. Wildtype animals were defined as L1 larvae with smooth cuticles that did not have any sort of lumpy body morphology.

[0128] 8) A confirmed suppressor was one that was ≥2 standard deviations away from the mean of the controls for at least three of the four retest experiments.
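
The quantitative retest scoring and the confirmation criterion of steps 7 and 8 may be sketched as follows. This is a minimal illustration and not the software used in the screen: the function names are hypothetical, the sample standard deviation of the control wells is assumed, and "away from the mean" is read as a deviation in either direction, since the protocol does not state a direction.

```python
from statistics import mean, stdev


def percent_wildtype(wildtype, lumpy, unhatched):
    """Percentage of wildtype appearing animals among all objects scored."""
    total = wildtype + lumpy + unhatched
    if total == 0:
        raise ValueError("no objects scored")
    return 100.0 * wildtype / total


def deviates_from_controls(score, control_scores, n_sd=2.0):
    """True if the score lies at least n_sd standard deviations from the control mean.

    Assumption: sample standard deviation, deviation in either direction.
    """
    control_mean = mean(control_scores)
    control_sd = stdev(control_scores)
    return abs(score - control_mean) >= n_sd * control_sd


def is_confirmed_suppressor(retests, min_passing=3):
    """retests is a list of (candidate_score, control_scores) pairs, one per retest."""
    passing = sum(
        1 for score, controls in retests if deviates_from_controls(score, controls)
    )
    return passing >= min_passing
```

For example, a plate with 30 wildtype appearing animals, 60 lumpy animals, and 10 unhatched embryos receives a score of 30%, which is then compared with the scores of the buffer-only control wells from the same retest.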

[0129] We identified C10C6.1 as a suppressor from this screen. Human orthologs of the modifier are referred to herein as MBCAT.

[0130] BLAST analysis (Altschul et al., supra) was employed to identify orthologs of the C. elegans modifier. For example, representative sequences from MBCAT, GI#s 14149671, 14759411, 17455844, and 27498257 (SEQ ID NOs: 14, 15, 16, and 18, respectively) share 41%, 36%, 43% and 41% amino acid identity, respectively, with the C. elegans C10C6.1.

[0131] Various domains, signals, and functional subunits in proteins were analyzed using the PSORT (Nakai K., and Horton P., Trends Biochem Sci, 1999, 24:34-6; Kenta Nakai, Protein sorting signals and prediction of subcellular localization, Adv. Protein Chem. 54, 277-344 (2000)), PFAM (Bateman A., et al., Nucleic Acids Res, 1999, 27:260-2), SMART (Ponting C P, et al., SMART: identification, and annotation of domains from signaling and extracellular protein sequences. Nucleic Acids Res. Jan. 1, 1999;27(1):229-32), TM-HMM (Erik L. L. Sonnhammer, Gunnar von Heijne, and Anders Krogh: A hidden Markov model for predicting transmembrane helices in protein sequences. In Proc. of Sixth Int. Conf. on Intelligent Systems for Molecular Biology, p 175-182 Ed J. Glasgow, T. Littlejohn, F. Major, R. Lathrop, D. Sankoff, and C. Sensen Menlo Park, Calif.: AAAI Press, 1998), and dust (Remm M, and Sonnhammer E. Classification of transmembrane protein families in the Caenorhabditis elegans genome and identification of human orthologs. Genome Res. 2000 November;10(11):1679-89) programs. For example, the kinase domains of MBCAT from GI#s 14149671, 14759411, 17455844, and 27498257 (SEQ ID NOs: 14, 15, 16, and 18, respectively) are located respectively at approximately amino acid residues 512 to 785, 460 to 547, 197 to 470, and 39 to 312 (PFAM 00069).

[0132] II. High-Throughput In Vitro Fluorescence Polarization Assay

[0133] Fluorescently-labeled MBCAT peptide/substrate is added to each well of a 96-well microtiter plate, along with a test agent in a test buffer (10 mM HEPES, 10 mM NaCl, 6 mM magnesium chloride, pH 7.6). Changes in fluorescence polarization relative to control values, determined by using a Fluorolite FPM-2 Fluorescence Polarization Microtiter System (Dynatech Laboratories, Inc), indicate that the test compound is a candidate modifier of MBCAT activity.
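
The polarization read-out may be illustrated with the standard definition of fluorescence polarization, P = (I_parallel − G·I_perpendicular)/(I_parallel + G·I_perpendicular), expressed in millipolarization units (mP). The following is a minimal sketch under stated assumptions: the function names and intensity values are hypothetical, the instrument correction factor G is assumed to be 1 unless supplied, and no particular threshold for a significant change is implied by the assay description.

```python
def polarization_mp(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in mP from parallel and perpendicular intensities."""
    corrected_perp = g_factor * i_perpendicular
    denominator = i_parallel + corrected_perp
    if denominator <= 0:
        raise ValueError("total intensity must be positive")
    return 1000.0 * (i_parallel - corrected_perp) / denominator


def polarization_change_mp(test_well, control_well):
    """Difference in mP between a test well and a no-agent control well.

    Each well is an (i_parallel, i_perpendicular) pair of raw intensities.
    """
    return polarization_mp(*test_well) - polarization_mp(*control_well)


# Hypothetical raw intensities for one test well and one control well
change = polarization_change_mp((300.0, 100.0), (250.0, 150.0))
```

A test agent whose wells show a change in mP relative to the control wells would, under the assay as described, be flagged as a candidate modifier.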

[0134] III. High-Throughput In Vitro Binding Assay

[0135] ³³P-labeled MBCAT peptide is added in an assay buffer (100 mM KCl, 20 mM HEPES pH 7.6, 1 mM MgCl₂, 1% glycerol, 0.5% NP-40, 50 mM beta-mercaptoethanol, 1 mg/ml BSA, cocktail of protease inhibitors) along with a test agent to the wells of a Neutralite-avidin coated assay plate and incubated at 25° C. for 1 hour. Biotinylated substrate is then added to each well and incubated for 1 hour. Reactions are stopped by washing with PBS, and counted in a scintillation counter. Test agents that cause a difference in activity relative to control without test agent are identified as candidate beta-catenin modulating agents.

[0136] IV. Immunoprecipitations and Immunoblotting

[0137] For coprecipitation of transfected proteins, 3×10⁶ appropriate recombinant cells containing the MBCAT proteins are plated on 10-cm dishes and transfected on the following day with expression constructs. The total amount of DNA is kept constant in each transfection by adding empty vector. After 24 h, cells are collected, washed once with phosphate-buffered saline and lysed for 20 min on ice in 1 ml of lysis buffer containing 50 mM Hepes, pH 7.9, 250 mM NaCl, 20 mM β-glycerophosphate, 1 mM sodium orthovanadate, 5 mM p-nitrophenyl phosphate, 2 mM dithiothreitol, protease inhibitors (complete, Roche Molecular Biochemicals), and 1% Nonidet P-40. Cellular debris is removed by centrifugation twice at 15,000×g for 15 min. The cell lysate is incubated with 25 μl of M2 beads (Sigma) for 2 h at 4° C. with gentle rocking.

[0138] After extensive washing with lysis buffer, proteins bound to the beads are solubilized by boiling in SDS sample buffer, fractionated by SDS-polyacrylamide gel electrophoresis, transferred to polyvinylidene difluoride membrane and blotted with the indicated antibodies. The reactive bands are visualized with horseradish peroxidase coupled to the appropriate secondary antibodies and the enhanced chemiluminescence (ECL) Western blotting detection system (Amersham Pharmacia Biotech).

[0139] V. Kinase Assay

[0140] A purified or partially purified MBCAT is diluted in a suitable reaction buffer, e.g., 50 mM Hepes, pH 7.5, containing magnesium chloride or manganese chloride (1-20 mM) and a peptide or polypeptide substrate, such as myelin basic protein or casein (1-10 μg/ml). The final concentration of the kinase is 1-20 nM. The enzyme reaction is conducted in microtiter plates to facilitate optimization of reaction conditions by increasing assay throughput. A 96-well microtiter plate is employed using a final volume of 30-100 μl. The reaction is initiated by the addition of ³³P-gamma-ATP (0.5 μCi/ml) and incubated for 0.5 to 3 hours at room temperature. Negative controls are provided by the addition of EDTA, which chelates the divalent cation (Mg²⁺ or Mn²⁺) required for enzymatic activity. Following the incubation, the enzyme reaction is quenched using EDTA. Samples of the reaction are transferred to a 96-well glass fiber filter plate (MultiScreen, Millipore). The filters are subsequently washed with phosphate-buffered saline, dilute phosphoric acid (0.5%) or other suitable medium to remove excess radiolabeled ATP. Scintillation cocktail is added to the filter plate and the incorporated radioactivity is quantitated by scintillation counting (Wallac/Perkin Elmer). Activity is defined by the amount of radioactivity detected following subtraction of the negative control reaction value (EDTA quench).
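
The background subtraction that defines activity can be sketched in Python as follows. This is an illustrative sketch only: the function name, the example counts, and the averaging of replicate EDTA control wells are assumptions, since the text does not state how many control wells are run or how they are combined.

```python
def kinase_activity(sample_cpm, edta_control_cpm):
    """Subtract the negative-control (EDTA) signal from each sample well.

    sample_cpm: list of counts per minute for reaction wells.
    edta_control_cpm: list of counts per minute for EDTA control wells;
        averaging replicates is an assumption made for illustration.
    """
    background = sum(edta_control_cpm) / len(edta_control_cpm)
    return [cpm - background for cpm in sample_cpm]


# Hypothetical counts: two reaction wells and two EDTA control wells.
print(kinase_activity([1500.0, 900.0], [100.0, 300.0]))
```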

[0141] VI. Expression Analysis

[0142] All cell lines used in the following experiments are NCI (National Cancer Institute) lines, and are available from ATCC (American Type Culture Collection, Manassas, Va. 20110-2209). Normal and tumor tissues were obtained from Impath, UC Davis, Clontech, Stratagene, Ardais, Genome Collaborative, and Ambion.

[0143] TaqMan analysis was used to assess expression levels of the disclosed genes in various samples.

[0144] RNA was extracted from each tissue sample using Qiagen (Valencia, Calif.) RNeasy kits, following manufacturer's protocols, to a final concentration of 50 ng/μl. Single stranded cDNA was then synthesized by reverse transcribing the RNA samples using random hexamers and 500 ng of total RNA per reaction, following protocol 4304965 of Applied Biosystems (Foster City, Calif.).

[0145] Primers for expression analysis using TaqMan assay (Applied Biosystems, Foster City, Calif.) were prepared according to the TaqMan protocols, and the following criteria: a) primer pairs were designed to span introns to eliminate genomic contamination, and b) each primer pair produced only one product. Expression analysis was performed using a 7900HT instrument.

[0146] TaqMan reactions were carried out following manufacturer's protocols, in 25 μl total volume for 96-well plates and 10 μl total volume for 384-well plates, using 300 nM primer and 250 nM probe, and approximately 25 ng of cDNA. The standard curve for result analysis was prepared using a universal pool of human cDNA samples, which is a mixture of cDNAs from a wide variety of tissues, so that any given target is likely to be present in appreciable amounts. The raw data were normalized using 18S rRNA (universally expressed in all tissues and cells).
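
One common way to combine a standard curve with reference-gene normalization is sketched below in Python. The text does not spell out the calculation, so this sketch assumes a log-linear standard curve (threshold cycle, Ct, against log10 of input quantity) fitted by least squares, and assumes normalization is a simple ratio of target quantity to 18S rRNA quantity. All function names and example values are hypothetical.

```python
import math


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(cts) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the input quantity for a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct, target_curve, ref_ct, ref_curve):
    """Target quantity divided by reference (e.g. 18S rRNA) quantity."""
    target_qty = quantity_from_ct(target_ct, *target_curve)
    ref_qty = quantity_from_ct(ref_ct, *ref_curve)
    return target_qty / ref_qty


# Hypothetical dilution series of the universal cDNA pool (relative units).
curve = fit_standard_curve([1.0, 10.0, 100.0], [30.0, 27.0, 24.0])
print(curve)
print(normalized_expression(27.0, curve, 24.0, curve))
```

Separate curves would normally be fitted for the target gene and for 18S rRNA; the example reuses one curve only to keep the sketch short.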

[0147] For each expression analysis, tumor tissue samples were compared with matched normal tissues from the same patient. A gene was considered overexpressed in a tumor when the level of expression of the gene was 2-fold or higher in the tumor compared with its matched normal sample. In cases where normal tissue was not available, a universal pool of cDNA samples was used instead. In these cases, a gene was considered overexpressed in a tumor sample when the difference of expression levels between a tumor sample and the average of all normal samples from the same tissue type was greater than 2 times the standard deviation of all normal samples (i.e., Tumor − average(all normal samples) > 2 × STDEV(all normal samples)).
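
The two overexpression criteria above, and the per-tumor-type percentage derived from the matched-pair criterion, can be sketched in Python. The function names and example values are hypothetical, and the use of the sample standard deviation (as computed by a spreadsheet STDEV function) is an assumption, since the text does not say whether the sample or population form was used.

```python
from statistics import mean, stdev


def overexpressed_matched(tumor, matched_normal, fold=2.0):
    """Matched-pair rule: tumor expression is at least `fold` times the matched normal."""
    return tumor >= fold * matched_normal


def overexpressed_unmatched(tumor, normals, k=2.0):
    """Unmatched rule: Tumor - average(normals) > k * STDEV(normals).

    Uses the sample standard deviation (an assumption); needs >= 2 normal samples.
    """
    return tumor - mean(normals) > k * stdev(normals)


def percent_overexpressed(pairs, fold=2.0):
    """Percentage of (tumor, matched normal) pairs meeting the matched-pair rule."""
    hits = sum(overexpressed_matched(t, n, fold) for t, n in pairs)
    return 100.0 * hits / len(pairs)


# Hypothetical normalized expression values.
print(overexpressed_matched(4.0, 2.0))
print(overexpressed_unmatched(4.1, [1.0, 2.0, 3.0]))
print(percent_overexpressed([(4.0, 2.0), (1.0, 1.0), (6.0, 2.0), (1.0, 2.0)]))
```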

[0148] Results are shown in Table 1. Number of pairs of tumor samples and matched normal tissue from the same patient are shown for each tumor type. Percentage of the samples with at least two-fold overexpression for each tumor type is provided. A modulator identified by an assay described herein can be further validated for therapeutic effect by administration to a tumor in which the gene is overexpressed. A decrease in tumor growth confirms therapeutic utility of the modulator. Prior to treating a patient with the modulator, the likelihood that the patient will respond to treatment can be diagnosed by obtaining a tumor sample from the patient, and assaying for expression of the gene targeted by the modulator. The expression data for the gene(s) can also be used as a diagnostic marker for disease progression. The assay can be performed by expression analysis as described above, by antibody directed to the gene target, or by any other available detection method.

TABLE 1

Each tumor-type entry gives the percentage of samples with at least two-fold overexpression, followed in parentheses by the number of pairs.

GI#                  SEQ ID NO:  Breast    Colon     Head and Neck  Kidney    Lung      Ovary     Prostate  Skin     Uterus
18561839             11          0% (21)   15% (33)  12% (8)        25% (24)  5% (21)   0% (11)   8% (12)   0% (3)   16% (19)
17455843             10          5% (21)   6% (33)   25% (8)        12% (24)  5% (21)   0% (11)   8% (12)   33% (3)  5% (19)
10440171             3           38% (21)  9% (33)   12% (8)        8% (24)   19% (21)  27% (11)  8% (12)   67% (3)  11% (19)
16198348             4           38% (21)  9% (33)   12% (8)        8% (24)   19% (21)  27% (11)  8% (12)   67% (3)  11% (19)
10440171 & 16198348  3 & 4       38% (21)  9% (33)   12% (8)        8% (24)   19% (21)  27% (11)  8% (12)   67% (3)  11% (19)
18600993             7           14% (21)  6% (33)   12% (8)        8% (24)   14% (21)  27% (11)  8% (12)   33% (3)  5% (19)

[0149]

1 18 1 5737 DNA Homo sapiens 1 taggcaggcg gctgagccgg cggcgggtggcctgcccaac gtgtgctggg tgggagaagg 60 cgaggcggca gcgatgctgt ctcttccgtgaggagcgcag aggaggtcgc ggcgccggag 120 gccccagaag gctcgaaggc gccgcgggctggggtcggtg gcttagggag cccgtccggc 180 catggtggcc gcgggtggtg gttggcgcggctgcgctgcg gcccggggca gtgcggagcc 240 gggacagtcg cggcgctgac gcccgcgggccccagctgca gatatgaagc ggagccgctg 300 ccgcgaccga ccgcagccgc cgccgcccgaccgccgggag gatggagttc agcgggcagc 360 ggagctgtct cagtctttgc cgccgcgccggcgagcgccg cccgggaggc agcggctgga 420 ggagcggacg ggccccgcgg ggcccgagggcaaggagcag gatgtagtaa ctggagttag 480 tcccctgctc ttcaggaaac tcagtaatcctgacatattt tcatccactg gaaaagttaa 540 acttcagcga caactgagtc aggatgattgtaagttatgg agaggaaacc tggccagctc 600 tctatcgggt aagcagctgc tccctttgtccagcagtgta catagcagtg tgggacaggt 660 gacttggcag tcgtcaggag aagcatcaaacctggttcga atgagaaacc agtcccttgg 720 acagtctgca ccttctctta ctgctggcctgaaggagttg agccttccaa gaagaggcag 780 cttttgtcgg acaagtaacc gcaagagcttgattgtgacc tctagcacat cacctacact 840 accacggcca cactcaccac tccatggccacacaggtaac agtcctttgg acagcccccg 900 gaatttctct ccaaatgcac ctgctcacttttcttttgtt cctgcccgta ggactgatgg 960 gcggcgctgg tctttggcct ctttgccctcttcaggatat ggaactaaca ctcctagctc 1020 cactgtctca tcatcatgct cctcacaggaaaagctgcat cagttgcctt tccagcctac 1080 agctgatgag ctgcactttt tgacgaagcatttcagcaca gagagcgtac cagatgagga 1140 aggacggcag tccccagcca tgcggcctcgctcccggagc ctcagtcccg gacgatcccc 1200 agtatccttt gacagtgaaa taataatgatgaatcacgtt tacaaagaaa gattcccaaa 1260 ggccaccgca caaatggaag agcgactagcagagtttatt tcctccaaca ctccagacag 1320 cgtgctgccc ttggcagatg gagccctgagctttattcat catcaggtga ttgagatggc 1380 ccgagactgc ctggataaat ctcggagtggcctcattaca tcacaatact tctacgaact 1440 tcaagagaat ttggagaaac ttttacaagatgctcatgag cgctcagaga gctcagaagt 1500 ggcttttgtg atgcagctgg tgaaaaagctgatgattatc attgcccgcc cagcacgtct 1560 cctggaatgc ctggagtttg accctgaagagttctaccac cttttagaag cagctgaggg 1620 ccacgccaaa gagggacaag ggattaaatgtgacattccc cgctacatcg ttagccagct 1680 gggcctcacc cgggatcccc 
tagaagaaatggcccagttg agcagctgtg acagtcctga 1740 cactccagag acagatgatt ctattgagggccatggggca tctctgccat ctaaaaagac 1800 accctctgaa gaggacttcg agaccattaagctcatcagc aatggcgcct atggggctgt 1860 atttctggtg cggcacaagt ccacccggcagcgctttgcc atgaagaaga tcaacaagca 1920 gaacctgatc ctacggaacc agatccagcaggccttcgtg gagcgtgaca tactgacttt 1980 cgctgagaac ccctttgtgg tcagcatgttctgctccttt gataccaagc gccacttgtg 2040 catggtgatg gagtacgttg aagggggagactgtgccact ctgctgaaga atattggggc 2100 cctgcctgtg gacatggtgc gtctatactttgcggaaact gtgctggccc tggagtactt 2160 acacaactat ggcatcgtgc accgtgacctcaagcctgac aacctcctaa ttacatccat 2220 ggggcacatc aagctcacgg actttggactgtccaaaatg ggcctcatga gtctgacaac 2280 gaacttgtat gagggtcata ttgaaaaggatgcccgggaa ttcctggaca agcaggtatg 2340 cgggacccca gaatacattg cgcctgaggtgatcctgcgc cagggctatg ggaagccagt 2400 ggactggtgg gccatgggca ttatcctgtatgagttcctg gtgggctgcg tccctttttt 2460 tggagatact ccggaggagc tctttgggcaggtgatcagt gatgagattg tgtggcctga 2520 gggtgatgag gcactgcccc cagacgcccaggacctcacc tccaaactgc tccaccagaa 2580 ccctctggag agacttggca caggcagtgcctatgaggtg aagcagcacc cattctttac 2640 tggtctggac tggacaggac ttctccgccagaaggctgaa tttattcctc agttggagtc 2700 agaggatgat actagctatt ttgacacccgctcagagcga taccaccaca tggactcgga 2760 ggatgaggaa gaagtgagtg aggatggctgccttgagatc cgccagttct cttcctgctc 2820 tccaaggttc aacaaggtgt acagcagcatggagcggctc tcactgctcg aggagcgccg 2880 gacaccaccc ccgaccaagc gcagcctgagtgaggagaag gaggaccatt cagatggcct 2940 ggcagggctc aaaggccgag accggagctgggtgattggc tcccctgaga tattacggaa 3000 gcggctgtcg gtgtctgagt cgtcccacacagagagtgac tcaagccctc caatgacagt 3060 gcgacgccgc tgctcaggcc tcctggatgcgcctcggttc ccggagggcc ctgaggaggc 3120 cagcagcacc ctcaggaggc aaccacaggagggtatatgg gtcctgacac ccccatctgg 3180 agagggggta tctgggcctg tcactgaacactcaggggag cagcggccaa agctggatga 3240 ggaagctgtt ggccggagca gtggttccagtccagctatg gagacccgag gccgtgggac 3300 ctcacagctg gctgagggag ccacagccaaggccatcagt gacctggctg tgcgtagggc 3360 ccgccaccgg ctgctctctg gggactcaacagagaagcgc actgctcgcc 
ctgtcaacaa 3420 agtgatcaag tccgcctcag ccacagccctctcactcctc attccttcgg aacaccacac 3480 ctgctccccg ttggccagcc ccatgtccccacattctcag tcgtccaacc catcatcccg 3540 ggactcttct ccaagcaggg acttcttgccagcccttggc agcatgaggc ctcccatcat 3600 catccaccga gctggcaaga agtatggcttcaccctgcgg gccattcgcg tctacatggg 3660 tgactccgat gtctacaccg tgcaccatatggtgtggcac gtggaggatg gaggtccggc 3720 cagtgaggca gggcttcgtc aaggtgacctcatcacccat gtcaatgggg aacctgtgca 3780 tggcctggtg cacacggagg tggtggagctgatcctgaag agtggaaaca aggtggccat 3840 ttcaacaact cccctggaga acacatccattaaagtgggg ccagctcgga agggcagcta 3900 caaggccaag atggcccgaa ggagcaagaggagccgcggc aaggatgggc aagaaagcag 3960 aaaaaggagc tccctgttcc gcaagatcaccaagcaagca tccctgctcc acaccagccg 4020 cagcctttct tcccttaacc gctccttgtcatcaggggag agtgggccag gctctcccac 4080 acacagccac agcctttccc cccgatctcccactcaaggc taccgggtga cccccgatgc 4140 tgtgcattca gtgggaggga attcatcacagagcagctcc cccagctcca gcgtgcccag 4200 ttccccagcc ggctctgggc acacacggcccagctccctc cacggtctgg cacccaagct 4260 ccaacgccag taccgctctc cacggcgcaagtcagcaggc agcatcccac tgtcaccact 4320 ggcccacacc ccttctcccc cacccccaacagcttcacct cagcggtccc catcgcccct 4380 gtctggccat gtagcccagg cctttcccacaaagcttcac ttgtcacctc ccctgggcag 4440 gcaactctca cggcccaaga gtgcggagccaccccgttca ccactactca agagggtgca 4500 gtcggctgag aaactggcag cagcacttgccgcctctgag aagaagctag ccacttctcg 4560 caagcacagc cttgacctgc cccactctgaactaaagaag gaactgccgc ccagggaagt 4620 gagccctctg gaggtagttg gagccaggagtgtgctgtct ggcaaggggg ccctgccagg 4680 gaagggggtg ctgcagcctg ctccctcacgggccctaggc accctccggc aggaccgagc 4740 cgaacgacgg gagtcgctgc agaagcaagaagccattcgt gaggtggact cctcagagga 4800 cgacaccgag gaagggcctg agaacagccagggtgcacag gagctgagct tggcacctca 4860 cccagaagtg agccagagtg tggcccctaaaggagcagga gagagtgggg aagaggatcc 4920 tttcccgtcc agaggcccta ggagcctgggcccaatggtc ccaagcctat tgacagggat 4980 cacactgggg cctcccagaa tggaaagtcccagtggtccc cacaggaggc tcgggagccc 5040 acaagccatt gaggaggctg ccagctcctcctcagcaggc cccaacctag gtcagtctgg 5100 agccacagac cccatccctc 
ctgaaggttgctggaaggcc cagcacctcc acacccaggc 5160 actaacagca ctttctccca gcacttcgggactcaccccc accagcagtt gctctcctcc 5220 cagctccacc tctgggaagc tgagcatgtggtcctggaaa tcccttattg agggcccaga 5280 cagggcatcc ccaagcagaa aggcaaccatggcaggtggg ctagccaacc tccaggattt 5340 ggaaacacaa ctccagccca gcctaagaacctgtctccca gggagcaggg gaagacacag 5400 ccacctagtg cccccagact ggcccatccatcttatgagg atcccagcca gggctggcta 5460 tgggagtctg agtgtgcaca agcagtgaaagaggatccag ccctgagcat cacccaagtg 5520 cctgatgcct caggtgacag aaggcaggacgttccatgcc gaggctgccc cctcacccag 5580 aagtctgagc ccagcctcag gaggggccaagaaccagggg gccatcaaaa gcatcgggat 5640 ttggcattgg ttccagatga gcttttaaagcaaacatagc agttgtttgc catttcttgc 5700 actcagacct gtgtaatata tgctcctggaaaccatc 5737 2 4049 DNA Homo sapiens 2 cccgggatcc cctagaagaa atggcccagttgagcagctg tgacagtcct gacactccag 60 agacagatga ttctattgag ggccatggggcatctctgcc atctaaaaag acaccctctg 120 aagaggactt cgagaccatt aagctcatcagcaatggcgc ctatggggct gtatttctgg 180 tgcggcacaa gtccacccgg cagcgctttgccatgaagaa gatcaacaag cagaacctga 240 tcctacggaa ccagatccag caggccttcgtggagcgtga catactgact ttcgctgaga 300 acccctttgt ggtcagcatg ttctgctcctttgataccaa gcgccacttg tgcatggtga 360 tggagtacgt tgaaggggga gactgtgccactctgctgaa gaatattggg gccctgcctg 420 tggacatggt gcgtctatac tttgcggaaactgtgctggc cctggagtac ttacacaact 480 atggcatcgt gcaccgtgac ctcaagcctgacaacctcct aattacatcc atggggcaca 540 tcaagctcac ggactttgga ctgtccaaaatgggcctcat gagtctgaca acgaacttgt 600 atgagggtca tattgaaaag gatgcccgggaattcctgga caagcaggta tgcgggaccc 660 cagaatacat tgcgcctgag gtgatcctgcgccagggcta tgggaagcca gtggactggt 720 gggccatggg cattatcctg tatgagttcctggtgggctg cgtccctttt tttggagata 780 ctccggagga gctctttggg caggtgatcagtgatgagat tgtgtggcct gagggtgatg 840 aggcactgcc cccagacgcc caggacctcacctccaaact gctccaccag aaccctctgg 900 agagacttgg cacaggcagt gcctatgaggtgaagcagca cccattcttt actggtctgg 960 actggacagg acttctccgc cagaaggctgaatttattcc tcagttggag tcagaggatg 1020 atactagcta ttttgacacc cgctcagagcgataccacca catggactcg gaggatgagg 1080 
aagaagtgag tgaggatggc tgccttgagatccgccagtt ctcttcctgc tctccaaggt 1140 tcaacaaggt gtacagcagc atggagcggctctcactgct cgaggagcgc cggacaccac 1200 ccccgaccaa gcgcagcctg agtgaggagaaggaggacca ttcagatggc ctggcagggc 1260 tcaaaggccg agaccggagc tgggtgattggctcccctga gatattacgg aagcggctgt 1320 cggtgtctga gtcgtcccac acagagagtgactcaagccc tccaatgaca gtgcgacgcc 1380 gctgctcagg cctcctggat gcgcctcggttcccggaggg ccctgaggag gccagcagca 1440 ccctcaggag gcaaccacag gagggtatatgggtcctgac acccccatct ggagaggggg 1500 tatctgggcc tgtcactgaa cactcaggggagcagcggcc aaagctggat gaggaagctg 1560 ttggccggag cagtggttcc agtccagctatggagacccg aggccgtggg acctcacagc 1620 tggctgaggg agccacagcc aaggccatcagtgacctggc tgtgcgtagg gcccgccacc 1680 ggctgctctc tggggactca acagagaagcgcactgctcg ccctgtcaac aaagtgatca 1740 agtccgcctc agccacagcc ctctcactcctcattccttc ggaacaccac acctgctccc 1800 cgttggccag ccccatgtcc ccacattctcagtcgtccaa cccatcatcc cgggactctt 1860 ctccaagcag ggacttcttg ccagcccttggcagcatgag gcctcccatc atcatccacc 1920 gagctggcaa gaagtatggc ttcaccctgcgggccattcg cgtctacatg ggtgactccg 1980 atgtctacac cgtgcaccat atggtgtggcacgtggagga tggaggtccg gccagtgagg 2040 cagggcttcg tcaaggtgac ctcatcacccatgtcaatgg ggaacctgtg catggcctgg 2100 tgcacacgga ggtggtggag ctgatcctgaagagtggaaa caaggtggcc atttcaacaa 2160 ctcccctgga gaacacatcc attaaagtggggccagctcg gaagggcagc tacaaggcca 2220 agatggcccg aaggagcaag aggagccgcggcaaggatgg gcaagaaagc agaaaaagga 2280 gctccctgtt ccgcaagatc accaagcaagcatccctgct ccacaccagc cgcagccttt 2340 cttcccttaa ccgctccttg tcatcaggggagagtgggcc aggctctccc acacacagcc 2400 acagcctttc cccccgatct cccactcaaggctaccgggt gacccccgat gctgtgcatt 2460 cagtgggagg gaattcatca cagagcagctcccccagctc cagcgtgccc agttccccag 2520 ccggctctgg gcacacacgg cccagctccctccacggtct ggcacccaag ctccaacgcc 2580 agtaccgctc tccacggcgc aagtcagcaggcagcatccc actgtcacca ctggcccaca 2640 ccccttctcc cccaccccca acagcttcacctcagcggtc cccatcgccc ctgtctggcc 2700 atgtagccca ggcctttccc acaaagcttcacttgtcacc tcccctgggc aggcaactct 2760 cacggcccaa gagtgcggag 
ccaccccgttcaccactact caagagggtg cagtcggctg 2820 agaaactggc agcagcactt gccgcctctgagaagaagct agccacttct cgcaagcaca 2880 gccttgacct gccccactct gaactaaagaaggaactgcc gcccagggaa gtgagccctc 2940 tggaggtagt tggagccagg agtgtgctgtctggcaaggg ggccctgcca gggaaggggg 3000 tgctgcagcc tgctccctca cgggccctaggcaccctccg gcaggaccga gccgaacgac 3060 gggagtcgct gcagaagcaa gaagccattcgtgaggtgga ctcctcagag gacgacaccg 3120 aggaagggcc tgagaacagc cagggtgcacaggagctgag cttggcacct cacccagaag 3180 tgagccagag tgtggcccct aaaggagcaggagagagtgg ggaagaggat cctttcccgt 3240 ccagaggccc taggagcctg ggcccaatggtcccaagcct attgacaggg atcacactgg 3300 ggcctcccag aatggaaagt cccagtggtccccacaggag gctcgggagc ccacaagcca 3360 ttgaggaggc tgccagctcc tcctcagcaggccccaacct aggtcagtct ggagccacag 3420 accccatccc tcctgaaggt tgctggaaggcccagcacct ccacacccag gcactaacag 3480 cactttctcc cagcacttcg ggactcacccccaccagcag ttgctctcct cccagctcca 3540 cctctgggaa gctgagcatg tggtcctggaaatcccttat tgagggccca gacagggcat 3600 ccccaagcag aaaggcaacc atggcaggtgggctagccaa cctccaggat ttggaaacac 3660 aactccagcc cagcctaaga acctgtctcccagggagcag gggaagacac agccacctag 3720 tgcccccaga ctggcccatc catcttatgaggatcccagc cagggctggc tatgggagtc 3780 tgagtgtgca caagcagtga aagaggatccagccctgagc atcacccaag tgcctgatgc 3840 ctcaggtgac agaaggcagg acgttccatgccgaggctgc cccctcaccc agaagtctga 3900 gcccagcctc aggaggggcc aagaaccagggggccatcaa aagcatcggg atttggcatt 3960 ggttccagat gagcttttaa agcaaacatagcagttgttt gccatttctt gcactcagac 4020 ctgtgtaata tatgctcctg gaaaccatc4049 3 2776 DNA Homo sapiens 3 attcatttgg gatctggaac ttgtcactgtcccactgctg cctcgtgggc tcttaggtct 60 tttgtggggg gatgggaggg gtgttctgaagatcatgttt ttgaagaaaa gtactttaat 120 ttttgccaat ctctaccctt tgccgaggagctgaagtaaa ccagcacatg ttttcaccca 180 catctgctcc agccctcttc ctcactaaagtcccatttag tgctgattgt gctttggcta 240 cttctcctct tgccattttc ctgaacccacgagcccacag cagtcctggc actccttgtt 300 ccagccgccc actgccgtgg agttgtcggacaagtaaccg caagagcttg attgtgacct 360 ctagcacatc acctacacta ccacggccacactcaccact ccatggccac acaggtaaca 420 
gtcctttgga cagcccccgg aatttctctccaaatgcacc tgctcacttt tcttttgttc 480 ctgcccgtag gactgatggg cggcgctggtctttggcctc tttgccctct tcaggatatg 540 gaactaacac tcctagctcc actgtctcatcatcatgctc ctcacaggaa aagctgcatc 600 agttgccttt ccagcctaca gctgatgagctgcacttttt gacgaagcat ttcagcacag 660 agagcgtacc agatgaggaa ggacggcagtccccagccat gcggcctcgc tcccggagcc 720 tcagtcccgg acgatcccca gtatcctttgacagtgaaat aataatgatg aatcatgttt 780 acaaagaaag attcccaaag gccaccgcacaaatggaaga gcgactagca gagtttattt 840 cctccaacac tccagacagc gtgctgcccttggcagatgg agccctgagc tttattcatc 900 atcaggtgat tgagatggcc cgagactgcctggataaatc tcggagtggc ctcattacat 960 cacaatactt ctacgaactt caagagaatttggagaaact tttacaagat gctcatgagc 1020 gctcagagag ctcagaagtg gcttttgtgatgcagctggt gaaaaagctg atgattatca 1080 ttgcccgccc agcacgtctc ctggaatgcctggagtttga ccctgaagag ttctaccacc 1140 ttttagaagc agctgagggc cacgccaaagagggacaagg gattaaatgt gacattcccc 1200 gctacatcgt tagccagctg ggcctcacccgggatcccct agaagaaatg gcccagttga 1260 gcagctgtga cagtcctgac actccagagacagatgattc tattgagggc catggggcat 1320 ctctgccatc taaaaagaca ccctctgaagaggacttcga gaccattaag ctcatcagca 1380 atggcgccta tggggctgta tttctggtgcggcacaagtc cacccggcag cgctttgcca 1440 tgaagaagat caacaagcag aacctgatcctacggaacca gatccagcag gccttcgtgg 1500 agcgtgacat actgactttc gctgagaacccctttgtggt cagcatgttc tgctcctttg 1560 ataccaagcg ccacttgtgc atggtgatggagtacgttga agggggagac tgtgccactc 1620 tgctgaagaa tattggggcc ctgcctgtggacatggtgcg tctatacttt gcggaaactg 1680 tgctggccct ggagtactta cacaactatggcatcgtgca ccgtgacctc aagcctgaca 1740 acctcctaat tacatccatg gggcacatcaagctcacgga ctttggactg tccaaaatgg 1800 gcctcatgag tctgacaacg aacttgtatgagggtcatat tgaaaaggat gcccgggaat 1860 tcctggacaa gcaggtatgc gggaccccagaatacattgc gcctgaggtg atcctgcgcc 1920 agggctatgg gaagccagtg gactggtgggccatgggcat tatcctgtat gagttcctgg 1980 tgggctgcgt cccttttttt ggagatactccggaggagct ctttgggcag gtgatcagtg 2040 atgagattgt gtggcctgag ggtgatgaggcactgccccc agacgcccag gacctcacct 2100 ccaaactgct ccaccagaac cctctggagagacttggcac 
aggcagtgcc tatgaggtga 2160 agcagcaccc attctttact ggtctggactggacaggact tctccgccag aaggctgaat 2220 ttattcctca gttggagtca gaggatgatactagctattt tgacacccgc tcagagcgat 2280 accaccacat ggactcggag gatgaggaagaagtgagtga ggatggctgc cttgagatcc 2340 gccagttctc ttcctgctct ccaaggttcaacaaggtgta cagcagcatg gagcggctct 2400 cactgctcga ggagcgccgg acaccacccccgaccaagcg cagcctgagt gaggagaagg 2460 aggaccattc agatggcctg gcagggctcaaaggccgaga ccggagctgg gtgattggct 2520 cccctgagca tcacccaagt gcctgatgcctcaggtgaca gaaggcagga cgttccatgc 2580 cgaggctgcc ccctcaccca gaagtctgagcccagcctca ggaggggcca agaaccaggg 2640 ggccatcaaa agcatcggga tttggcattggttccagatg agcttttaaa gcaaacatag 2700 cagttgtttg ccatttcttg cactcagacctgtgtaatat atgctcctgg aaaccataaa 2760 aaaaaaaaaa aaaaaa 2776 4 4087 DNAHomo sapiens 4 ggcacgaggg agcgtaccag atgaggaagg acggcagtcc ccagccatgcggcctcgctc 60 ccggagcctc agtcccggac gatccccagt atcctttgac agtgaaataataatgatgaa 120 tcatgtttac aaagaaagat tcccaaaggc tcatgagcgc tcagagagctcagaagtggc 180 ttttgtgatg cagctggtga aaaagctgat gattatcatt gcccgcccagcacgtctcct 240 ggaatgcctg gagtttgacc ctgaagagtt ctaccacctt ttagaagcagctgagggcca 300 cgccaaagag ggacaaggga ttaaatgtga cattccccgc tacatcgttagccagctggg 360 cctcacccgg gatcccctag aagaaatggc ccagttgagc agctgtgacagtcctgacac 420 tccagagaca gatgattcta ttgagggcca tggggcatct ctgccatctaaaaagacacc 480 ctctgaagag gacttcgaga ccattaagct catcagcaat ggcgcctatggggctgtatt 540 tctggtgcgg cacaagtcca cccggcagcg ctttgccatg aagaagatcaacaagcagaa 600 cctgatccta cggaaccaga tccagcaggc cttcgtggag cgtgacatactgactttcgc 660 tgagaacccc tttgtggtca gcatgttctg ctcctttgat accaagcgccacttgtgcat 720 ggtgatggag tacgttgaag ggggagactg tgccactctg ctgaagaatattggggccct 780 gcctgtggac atggtgcgtc tatactttgc ggaaactgtg ctggccctggagtacttaca 840 caactatggc atcgtgcacc gtgacctcaa gcctgacaac ctcctaattacatccatggg 900 gcacatcaag ctcacggact ttggactgtc caaaatgggc ctcatgagtctgacaacgaa 960 cttgtatgag ggtcatattg aaaaggatgc ccgggaattc ctggacaagcaggtatgcgg 1020 gaccccagaa tacattgcgc ctgaggtgat cctgcgccag 
ggctatgggaagccagtgga 1080 ctggtgggcc atgggcatta tcctgtatga gttcctggtg ggctgcgtccctttttttgg 1140 agatactccg gaggagctct ttgggcaggt gatcagtgat gagattgtgtggcctgaggg 1200 tgatgaggca ctgcccccag acgcccagga cctcacctcc aaactgctccaccagaaccc 1260 tctggagaga cttggcacag gcagtgccta tgaggtgaag cagcacccattctttactgg 1320 tctggactgg acaggacttc tccgccagaa ggctgaattt attcctcagttggagtcaga 1380 ggatgatact agctattttg acacccgctc agagcgatac caccacatggactcggagga 1440 tgaggaagaa gtgagtgagg atggctgcct tgagatccgc cagttctcttcctgctctcc 1500 aaggttcaac aaggtgtaca gcagcatgga gcggctctca ctgctcgaggagcgccggac 1560 accacccccg accaagcgca gcctgagtga ggagaaggag gaccattcagatggcctggc 1620 agggctcaaa ggccgagacc ggagctgggt gattggctcc cctgagatattacggaagcg 1680 gctgtcggtg tctgagtcgt cccacacaga gagtgactca agccctccaatgacagtgcg 1740 acgccgctgc tcaggcctcc tggatgcgcc tcggttcccg gagggccctgaggaggccag 1800 cagcaccctc aggaggcaac cacaggaggg tatatgggtc ctgacacccccatctggaga 1860 gggggtatct gggcctgtca ctgaacactc aggggagcag cggccaaagctggatgagga 1920 agctgttggc cggagcagtg gttccagtcc agctatggag acccgaggccgtgggacctc 1980 acagctggct gagggagcca cagccaaggc catcagtgac ctggctgtgcgtagggcccg 2040 ccaccggctg ctctctgggg actcaacaga gaagcgcact gctcgccctgtcaacaaagt 2100 gatcaagtcc gcctcagcca cagccctctc actcctcatt ccttcggaacaccacacctg 2160 ctccccgttg gccagcccca tgtccccaca ttctcagtcg tccaacccatcatcccggga 2220 ctcttctcca agaagtatgg cttcaccctg cgggccattc gcgtctacatgggtgactcc 2280 gatgtctaca ccgtgcacca tatggtgtgg cacgtggagg atggaggtccggccagtgag 2340 gcagggcttc gtcaaggtga cctcatcacc catgtcaatg gggaacctgtgcatggcctg 2400 gtgcacacgg aggtggtgga gctgatcctg aagagtggaa acaaggtggccatttcaaca 2460 actcccctgg agaacacatc cattaaagtg gggccagctc ggaagggcagctacaaggcc 2520 aagatggccc gaaggagcaa gaggagccgc ggcaaggatg ggcaagaaagcagaaaaagg 2580 agctccctgt tccgcaagat caccaagcaa gcatccctgc tccacaccagccgcagcctt 2640 tcttccctta accgctcctt gtcatcaggg gagagtgggc caggctctcccacacacagc 2700 cacagccttt ccccccgatc tcccactcaa ggctaccggg tgacccccgatgctgtgcat 2760 tcaggcaact 
ctcacggccc aagagtgcgg agccaccccg ttcaccactactcaagaggg 2820 tgcagtcggc tgagaaactg gcagcagcac ttgccgcctc tgagaagaagctagccactt 2880 ctcgcaagca cagccttgac ctgccccact ctgaactaaa gaaggaactgccgcccaggg 2940 aagtgagccc tctggaggta gttggagcca ggagtgtgct gtctggcaagggggccctgc 3000 cagggaaggg ggtgctgcag cctgctccct cacgggccct aggcaccctccggcaggacc 3060 gagccgaacg acgggagtcg ctgcagaagc aagaagccat tcgtgaggtggactcctcag 3120 aggacgacac cgaggaaggg cctgagaaca gccagggtgc acaggagctgagcttggcac 3180 ctcacccaga agtgagccag agtgtggccc ctaaaggagc aggagagagtggggaagagg 3240 atcctttccc gtccagagac cctaggagcc tgggcccaat ggtcccaagcctattgacag 3300 ggatcacact ggggcctccc agaatggaaa gtcccagtgg tccccacaggaggctcggga 3360 gcccacaagc cattgaggag gctgccagct cctcctcagc aggccccaacctaggtcagt 3420 ctggagccac agaccccatc cctcctgaag gttgctggaa ggcccagcacctccacaccc 3480 aggcactaac agcactttct cccagcactt cgggactcac ccccaccagcagttgctctc 3540 ctcccagctc cacctctggg aagctgagca tgtggtcctg gaaatcccttattgagggcc 3600 cagacagggc atccccaagc agaaaggcaa ccatggcagg tgggctagccaacctccagg 3660 atttggaaaa cacaactcca gcccagccta agaacctgtc tcccagggagcaggggaaga 3720 cacagccacc tagtgccccc agactggccc atccatctta tgaggatcccagccagggct 3780 ggctatggga gtctgagtgt gcacaagcag tgaaagagga tccagccctgagcatcaccc 3840 aagtgcctga tgcctcaggt gacagaaggc aggacgttcc atgccgaggctgccccctca 3900 cccagaagtc tgagcccagc ctcaggaggg gccaagaacc agggggccatcaaaagcatc 3960 gggatttggc attggttcca gatgagcttt taaagcaaac atagcagttgtttgccattt 4020 cttgcactca gacctgtgta atatatgctc ctggaaccca aaaaaaaaaaaaaaaaaaaa 4080 aaaaaaa 4087 5 2113 DNA Homo sapiens 5 aatgatctagctcctgaaga aagaggcttt cgttactgtc caagaagtga agaccagtga 60 gtctagcgataaagaaacaa aagagcccct gacagtatcg aaagaatctg acatttgctt 120 tgtctgtacaaaggatgaaa ggaattttgc cgacttagtc aaaacaataa ctggttttat 180 atcattttgagtctaaaata caggcatcag aatcgtatgt ttttctttgt atttctgaag 240 aattgtcatctttttaaaaa cgtatgtgga gtggttttgt ttctggaaag gaaaacgtga 300 catggattttataagcttag tagctgtagg tggagactct tgttttgttt gctaaacctt 360 
tttaataagcagcaaagtca ggatatcaaa gattggaatt tttcccccgg aaccttttgt 420 tcttttcttagtacttcaaa taatttcaac caaaggggta tgcatggaaa atgaaaacaa 480 atatgaagtctttccgctat ttggcagtgc ttcagatgag atgtatctat acttctgtga 540 agaaagacgtgattctgata tgaggcagtg attgtatact tacgctggac agagggataa 600 ctggtttagtggaacagtgg gttagtggtg tgctgaaagg attgtttgcc tctttatcca 660 gcagtaagtaggcactagta gttgactgtc tggggagtgg cttttttttg tctgatatct 720 attatggatgattccagtat tctgagaagg agagggttac agaaggagtt gagccttcca 780 agaagaggcagcttttgtcg gacaagtaac cgcaagagct tgattgtgac ctctagcaca 840 tcacctacactaccacggcc acactcacca ctccatggcc acacaggtaa cagtcctttg 900 gacagcccccggaatttctc tccaaatgca cctgctcact tttcttttgt tcctgcccgt 960 agccatagccacagagctga caggactgat gggcggcgct ggtctttggc ctctttgccc 1020 tcttcaggatatggaactaa cactcctagc tccactgtct catcatcatg ctcctcacag 1080 gaaaagctgcatcagttgcc tttccagcct acagctgatg agctgcactt tttgacgaag 1140 catttcagcacagagagcgt accagatgag gaaggacggc agtccccagc catgcagcct 1200 gctccctcacgggccctagg caccctccgg caggaccgag ccgaacgacg ggagtcgctg 1260 cagaagcaagaagccattcg tgaggtggac tcctcagagg acgacaccga ggaagggcct 1320 gagaacagccagggtgcaca ggagctgagc ttggcacctc acccagaagt gagccagagt 1380 gtggcccctaaaggagcagg agagagtggg gaagaggatc ctttcccgtc cagagaccct 1440 aggagcctgggcccaatggt cccaagccta ttgacaggga tcacactggg gcctcccaga 1500 atggaaagtcccagtggtcc ccacaggagg ctcgggagcc cacaagccat tgaggaggct 1560 gccagctcctcctcagcagg ccccaaccta ggtcagtctg gagccacaga ccctatccct 1620 cctgaaggttgctggaaggc ccagcacctc cacacccagg cactaacagc actttctccc 1680 agcacttcgggactcacccc catccatctt atgaggatcc cagccagggc tggctatggg 1740 agtctgagtgtgcacaagca gtgaaagagg atccagccct gagcatcacc caagtgcctg 1800 atgcctcaggtgacagaagg caggacgttc catgccgagg ctgccccctc acccagaagt 1860 ctgagcccagcctcaggagg ggccaagaac cagggggcca tcaaaagcat cgggatttgg 1920 cattggttccagatgagctt ttaaagcaaa catagcagtt gtttgccatt tcttgcactc 1980 agacctgtgtaatatatgct cctggaaacc atctttatgt cttttgcttg cttgttttcc 2040 ttcggtcaacccacatgtaa ctaggtcctg tgttgctgct 
gggaatatag tggtgaataa 2100 aacagtttccacc 2113 6 2945 DNA Homo sapiens 6 tatggatgat tccagtattc tgagaaggagagggttacag aaggagttga gccttccaag 60 aagaggcagt ttatttcttt tcccggcccagcgcgatggc ttacacctgt aatcccagca 120 ctttactttg ggaggctgag gcaagtggatcacaaggtca ggagttcaag aacagcctgg 180 ccaacatgtt gtcggacaag taaccgcaagagcttgattg tgacctctag cacatcacct 240 acactaccac ggccacactc accactccatggccacacag gtaacagtcc tttggacagc 300 ccccggaatt tctctccaaa tgcacctgctcacttttctt ttgttcctgc ccgtagccat 360 agccacagag ctgacaggac tgatgggcggcgctggtctt tggcctcttt gccctcttca 420 ggatatggaa ctaacactcc tagctccactgtctcatcat catgctcctc acaggaaaag 480 ctgcatcagt tgcctttcca gcctacagctgatgagctgc actttttgac gaagcatttc 540 agcacagaga gcgtaccaga tgaggaaggacggcagtccc cagccatgcg gcctcgctcc 600 cggagcctca gtcccggacg atccccagtatcctttgaca gtgaaataat aatgatgaat 660 catgtttaca aagaaagatt cccaaaggccaccgcacaaa tggaagagcg actagcagag 720 tttatttcct ccaacactcc agacagcgtgctgcccttgg cagatggagc cctgagcttt 780 attcatcatc aggtgattga gatggcccgagactgcctgg ataaatctcg gagtggcctc 840 attacatcac aatacttcta cgaacttcaagagaatttgg agaaactttt acaagatgag 900 tttgaccctg aagagttcta ccaccttttagaagcagctg agggccacgc caaagaggga 960 caagggatta aatgtgacat tccccgctacatcgttagcc agctgggcct cacccgggat 1020 cccctagaag aaatggccca gttgagcagctgtgacagtc ctgacactcc agagacagat 1080 gattctattg agggccatgg ggcatctctgccatctaaaa agacaccctc tgaagaggac 1140 ttcgagacca ttaagctcat cagcaatggcgcctatgggg ctgtatttct ggtgcggcac 1200 aagtccaccc ggcagcgctt tgccatgaagaagatcaaca agcagaacct gatcctacgg 1260 aaccagatcc agcaggcctt cgtggagcgtgacatactga ctttcgctga gaaccccttt 1320 gtggtcagca tgttctgctc ctttgataccaagcgccact tgtgcatggt gatggagtac 1380 gttgaagggg gagactgtgc cactctgctgaagaatattg gggccctgcc tgtggacatg 1440 gtgcgtctat actttgcgga aactgtgctggccctggagt acttacacaa ctatggcatc 1500 gtgcaccgtg acctcaagcc tgacaacctcctaattacat ccatggggca catcaagctc 1560 acggactttg gactgtccaa aattggcctcatgagtctga caacgaactt gtatgagggt 1620 catattgaaa aggatgcccg ggaattcctggacaagcagg tatgcgggac 
cccagaatac 1680 attgcgcctg aggtgatcct gcgccagggctatgggaagc cagtggactg gtgggccatg 1740 ggcattatcc tgtatgagtt cctggtgggctgcgtccctt tttttggaga tactccggag 1800 gagctctttg ggcaggtgat cagtgatgagattgtgtggc ctgagggtga tgaggcactg 1860 cccccagacg cccaggacct cacctccaaactgctccacc agaaccctct ggagagactt 1920 ggcacaggca gtgcctatga ggtgaagcagcacccattct ttactggtct ggactggaca 1980 ggacttctcc gccagaaggc tgaatttattcctcagttgg agtcagagga tgatactagc 2040 tattttgaca cccgctcaga gcgataccaccacatggact cggaggatga ggaagaagtg 2100 agtgaggatg gctgccttga gatccgccagttctcttcct gctctccaag gttcaacaag 2160 gtgtacagca gcatggagcg gctctcactgctcgaggagc gccggacacc acccccgacc 2220 aagcgcagcc tgagtgagga gaaggaggaccattcagatg gcctggcagg gctcaaaggc 2280 cgagaccgga gctgggtgat tggctcccctgagatattac ggaagcggct gtcggtgtct 2340 gagtcatccc tcctgaaggt tgctggaaggcccagcactt tctcccagca cttcgggact 2400 cacccccacc agcagttgct ctcctcccagctccacctct gggaagctga gcatgtggtc 2460 ctggaaatcc cttattgagg gcccagacagggcatcccca agcagaaagg caaccatggc 2520 aggtgggcta gccaacctcc aggatttggaaaacacaact ccagcccagc ctaagaacct 2580 gtctcccagg gagcagggga agacacagccacctagtgcc cccagactgg cccatccatc 2640 ttatgaggat cccagccagg gctggctatgggagtctgag tgtgcacaag cagtgaaaga 2700 ggatccagcc ctgagcatca cccaagtgcctgatgcctca ggtgacagaa ggcaggacgt 2760 tccatgccga ggctgccccc tcacccagaagtctgagccc agcctcagga ggggccaaga 2820 accagggggc catcaaaagc atcgggatttggcattggtt ccagatgagc ttttaaagca 2880 aacatagcag ttgtttgcca tttcttgcactcagacctgt gtaatatatg ctcctggaaa 2940 ccatc 2945 7 4865 DNA Homo sapiens7 tgctccccgc gccgccgccg ccgccgcctc cgccgctgct gccgcacctg ccaccatgtc 60gccgccgccg ggtcatgtct gactctctct ggaccgcgct ttctaatttc tcgatgccct 120ccttccccgg cggcagtatg ttccgccgca ccaagagctg ccgcaccagt aatcggaaaa 180gcctcatcct gaccagcact tcacccacgc taccgagacc ccactccccg ctgccaggtc 240acctaggcag cagtcccctg gacagccccc gaaacttctc ccccaacacc cccgcccact 300tctcgtttgc ctcctcccga agggcggacg gacgccggtg gtctctggcc tcgctccctt 360catctggcta tggcaccaac acgcccagtt ccaccgtctc gtcctcctgc 
tcctcccagg 420agcgccttca ccagctgccc taccagccca cggtggacga gctccacttc ctctccaaac 480acttcgggag caccgagagc atcacagacg aggatggtgg ccgtcgctcc ccagccgtgc 540ggccccgctc acggagcctc agccccgggc gctccccctc ctcctacgac aacgagatcg 600tgatgatgaa tcacgtctac aaggagaggt tcccgaaggc cactgcgcag atggaggaga 660agctgcgcga ctttacccgc gcctacgaac ccgacagcgt tctgcctctg gccgatggcg 720tgctcagctt catccaccac cagatcatcg agctggcccg ggactgcctg accaagtccc 780gtgacggcct catcaccacg gtctacttct atgaattgca ggagaacctg gagaagctcc 840ttcaagacgc ctatgaacgc tctgagagct tggaggtggc cttcgttact cagctggtga 900agaagttgct tattatcatc tcacgccctg cgaggctgct ggagtgcctg gaattcaacc 960ccgaggagtt ctaccacctg ctggaggcgg ccgaaggaca cgccaaggag ggccaccttg 1020tgaagacgga catcccccgc tacatcatcc gccagctggg cctcacccgt gacccctttc 1080cagatgtggt gcatctggag gaacaggaca gtggtggttc caacacccct gagcaagacg 1140atctctctga gggccgcagc agcaaggcca agaaaccgcc gggggagaat gacttcgata 1200ccatcaagct cataagcaac ggtgcctacg gcgctgtcta cctggtgcgg caccgcgaca 1260cgcggcagcg ctttgccatg aaaaagatca acaagcagaa cttgatcctc cgcaaccaga 1320tccagcaggc ctttgtggag cgcgatatcc tcaccttcgc cgagaacccg tttgtggtcg 1380gcatgttctg ctcctttgag actcggcgcc acctctgcat ggtcatggaa tatgtggaag 1440gcggcgactg tgccaccctg ctgaagaata ttggagcgct gcccgtagag atggcccgca 1500tgtactttgc tgagacggtg ctagccctgg agtatttgca caactatggc atcgtgcacc 1560gcgacctcaa gcctacagcc tccttatcac ctccatgggt cacatcaagc tcacagattt 1620cggcctctcc aagatggggc tcatgagcct caccaccaac ttatatgaag gccacatcga 1680gaaggacgcc cgagagttcc tggacaaaca ggtgtgtggg accccagagt acatcgcgcc 1740cgaggtcatc ctgcgtcaag gctacggcaa gccagtggac tggtgggcta tggggatcat 1800cctctacgag ttcctggtgg gctgtgtgcc cttcttcgga gacacaccag aggagctatt 1860tggacaggtc atcagtgatg acatcctgtg gcccgagggg gatgaggccc tacctacgga 1920ggcccaactc ctcatatcca gcctcctgca gaccaaccct ctggtcaggc ttggggcagg 1980cggcgctttt gaggtgaagc agcacagttt ctttcgagac ctggactgga cagggctgct 2040gaggcagaag gccgagttca tcccccacct agagtcggaa gatgacacta gctactttga 2100cacccgctca gacaggtatc accacgtgaa 
ctcctatgac gaggatgaca cgacggagga 2160ggagcccgtg gaaatccgcc agttctcttc ctgctctccg cgcttcagca aggtgtatag 2220cagcatggag cagctgtcgc agcacgagcc caagacccca gtagcagctg cagggagcag 2280caagcgggag ccgagcacca agggccccga ggagaaggtg gccggcaagc gggaggggct 2340gggcggcctg accctgcgtg agaagacctg gagagggggc tctccggaga tcaagcgatt 2400ctccgcgtcc gaggccagtt tcctggaggg agaggccagt ccccctttgg gcgcccgccg 2460ccgtttctcg gcgctgctgg agcccagccg cttcagcgcc ccccaagagg acgaggatga 2520ggcccggctg cgcaggcctc cccggcccag ctccgacccc gcgggatccc tggatgcacg 2580ggcccccaaa gaggagactc aaggggaagg cacctccagc gccggggact ccgaggccac 2640tgaccgtcca cgcccaggtg acctctgccc accctcgaag gatggggatg catcaggccc 2700aagggctacc aatgacttgg ttctgcgccg ggcgcggcac cagcagatgt caggggatgt 2760ggcagtagag aagaggcctt ctcgaactgg gggcaaagtc atcaaatcag cctcagccac 2820tgccttatct gtcatgattc ctgcagtgga cccacatgga agttcacccc ttgctagtcc 2880catgtctcca cgatctctgt cctccaaccc atcctcacgg gactcctcac ccagccggga 2940ctactcacca gctgtcagtg ggctccgctc ccccatcacc atccagcgct cgggcaagaa 3000gtatggcttc acactgcgtg ccatccgtgt ctacatgggt gacacggatg tctatagtgt 3060ccaccacatt gtctggcatg tggaggaagg aggcccagcc caggaggcag gactctgtgc 3120tggggacctc atcacccacg tgaatgggga gcctgtgcat ggcatggtgc atcctgaggt 3180cgtggagctg atccttaaga gtggcaacaa ggtagcagtg accacaacgc ccttcgaaaa 3240tacctctatc cgcattggtc ccgcaaggcg cagcagctac aaggctaaaa tggctcggag 3300gaacaagcga ccctccgcca aggagggcca ggagagcaag aagcgcagct ccctcttccg 3360gaagatcacg aagcagtcga acctgctgca tactagccgc tcgctgtcgt cgctgaaccg 3420ctcgctgtca tccagcgata gtctcccggg ctcgcctacg cacgggctgc cggcgcgctc 3480gcccacgcac agctaccgct ccacgcctga ctccgcctac ctaggcgcct catcccagag 3540cagctcccca gcctcgagca cgcccaactc gcctgcgtcg tcggcgtcgc accacattcg 3600gcccagcacg ctgcacggac tgtcgccaaa gctccatcgc cagtaccgct ctgcgcgatg 3660caagtcggcc ggcaacatcc ctctatcgcc gctggcacac acgccgtccc ccacgcaggc 3720gtcaccgccg ccactgccgg gccacacggt gggcagctcg cacactactc agagcttccc 3780ggccaaactg cactcatcgc ctcccgtcgt gcgcccgcgc cccaagagtg ccgagccccc 
3840tcgctcgccg ctcctcaagc gcgtgcagtc ggccgagaag ctgggagcct ctttgagtgc 3900ggacaagaag ggcgcgctgc gcaaacacag cctcgaggtg ggccacccgg atttccgcaa 3960ggacttccat ggcgagctgg cgctgcatag ccttgccgag tccgacggtg agacgccccc 4020agtcgagggc cttggcgcgc cccggcaggt cgccgtccgc cgcctgggcc gacaggagtc 4080acctttgagc ctgggcgcgg acccgttgct gcccgagggt gcctccaggc caccagtgtc 4140gagcaaggag aaggaatccc cggggggcgc cgaggcgtgc accccacccc gcgcgacgac 4200ccccggtggc cggaccctgg agcgggacgt cggctgcacg cggcatcaga gcgtgcagac 4260ggaggatggc actggcggga tggccagggc tgtggccaag gcggcgctga gcccggtgca 4320ggaacacgag acaggccggc gcagcagctc tggcgaggcg ggcacacccc tggtacccat 4380tgtcgtagag cctgcgcggc ccggggctaa ggctgtggtg cctcagcctc tgggcgcgga 4440ctccaagggg ttgcaggaac ccgcacccct ggcgccttcc gtgcccgagg ccccccgggg 4500ccgggagcgc tgggtgttgg aggtggtgga ggagcgcacc acgctgagcg gtcctcgctc 4560caagcccgcc tccccaaagc tctccccgga gccccagaca ccctccctag ccccagcgaa 4620gtgcagtgca cccagcagtg cagtgacccc agtcccaccc gcatccctct tgggctcagg 4680caccaagcct caagtggggc tgacctcccg gtgccctgct gaagctgtgc ccccagcagg 4740cctgaccaaa aaaggagtgt ccagtcccgc acccccggga ccatagccaa gggggtcatc 4800ggccccgcgc tgtacagcct ccgtatacat atgtacacat ataaataaag tgcgtccgtg 4860ctgcg 4865 8 4740 DNA Homo sapiens 8 ccgccgggtc atgtctgact ctctctggaccgcgctttct aatttctcga tgccctcctt 60 ccccggcggc agtatgttcc gccgcaccaagagctgccgc accagtaatc ggaaaagcct 120 catcctgacc agcacttcac ccacgctaccgagaccccac tccccgctgc caggtcacct 180 aggcagcagt cccctggaca gcccccgaaacttctccccc aacacccccg cccacttctc 240 gtttgcctcc tcccgaaggg cggacggacgccggtggtct ctggcctcgc tcccttcatc 300 tggctatggc accaacacgc ccagttccaccgtctcgtcc tcctgctcct cccaggagcg 360 ccttcaccag ctgccctacc agcccacggtggacgagctc cacttcctct ccaaacactt 420 cgggagcacc gagagcatca cagacgaggatggtggccgt cgctccccag ccgtgcggcc 480 ccgctcacgg agcctcagcc ccgggcgctccccctcctcc tacgacaacg agatcgtgat 540 gatgaatcac gtctacaagg agaggttcccgaaggccact gcgcagatgg aggagaagct 600 gcgcgacttt acccgcgcct acgaacccgacagcgttctg cctctggccg atggcgtgct 660 cagcttcatc 
caccaccaga tcatcgagctggcccgggac tgcctgacca agtcccgtga 720 cggcctcatc accacggtct acttctatgaattgcaggag aacctggaga agctccttca 780 agacgcctat gaacgctctg agagcttggaggtggccttc gttactcagc tggtgaagaa 840 gttgcttatt atcatctcac gccctgcgaggctgctggag tgcctggaat tcaaccccga 900 ggagttctac cacctgctgg aggcggccgaaggacacgcc aaggagggcc accttgtgaa 960 gacggacatc ccccgctaca tcatccgccagctgggcctc acccgtgacc cctttccaga 1020 tgtggtgcat ctggaggaac aggacagtggtggttccaac acccctgagc aagacgatct 1080 ctctgagggc cgcagcagca aggccaagaaaccgccgggg gagaatgact tcgataccat 1140 caagctcata agcaacggtg cctacggcgctgtctacctg gtgcggcacc gcgacacgcg 1200 gcagcgcttt gccatgaaaa agatcaacaagcagaacttg atcctccgca accagatcca 1260 gcaggccttt gtggagcgcg atatcctcaccttcgccgag aacccgtttg tggtcggcat 1320 gttctgctcc tttgagactc ggcgccacctctgcatggtc atggaatatg tggaaggcgg 1380 cgactgtgcc accctgctga agaatattggagcgctgccc gtagagatgg cccgcatgta 1440 ctttgctgag acggtgctag ccctggagtatttgcacaac tatggcatcg tgcaccgcga 1500 cctcaagcct gacaacctcc ttatcacctccatgggtcac atcaagctca cagatttcgg 1560 cctctccaag atggggctca tgagcctcaccaccaactta tatgaaggcc acatcgagaa 1620 ggacgcccga gagttcctgg acaaacaggtgtgtgggacc ccagagtaca tcgcgcccga 1680 ggtcatcctg cgtcaaggct acggcaagccagtggactgg tgggctatgg ggatcatcct 1740 ctacgagttc ctggtgggct gtgtgcccttcttcggagac acaccagagg agctatttgg 1800 acaggtcatc agtgatgaca tcctgtggcccgagggggat gaggccctac ctacggaggc 1860 ccaactcctc atatccagcc tcctgcagaccaaccctctg gtcaggcttg gggcaggcgg 1920 cgcttttgag gtgaagcagc acagtttctttcgagacctg gactggacag ggctgctgag 1980 gcagaaggcc gagttcatcc cccacctagagtcggaagat gacactagct actttgacac 2040 ccgctcagac aggtatcacc acgtgaactcctatgacgag gatgacacga cggaggagga 2100 gcccgtggaa atccgccagt tctcttcctgctctccgcgc ttcagcaagg tgtatagcag 2160 catggagcag ctgtcgcagc acgagcccaagaccccagta gcagctgcag ggagcagcaa 2220 gcgggagccg agcaccaagg gccccgaggagaaggtggcc ggcaagcggg aggggctggg 2280 cggcctgacc ctgcgtgaga agacctggagagggggctct ccggagatca agcgattctc 2340 cgcgtccgag gccagtttcc tggagggagaggccagtccc cctttgggcg 
cccgccgccg 2400 tttctcggcg ctgctggagc ccagccgcttcagcgccccc caagaggacg aggatgaggc 2460 ccggctgcgc aggcctcccc ggcccagctccgaccccgcg ggatccctgg atgcacgggc 2520 ccccaaagag gagactcaag gggaaggcacctccagcgcc ggggactccg aggccactga 2580 ccgtccacgc ccaggtgacc tctgcccaccctcgaaggat ggggatgcat caggcccaag 2640 ggctaccaat gacttggttc tgcgccgggcgcggcaccag cagatgtcag gggatgtggc 2700 agtagagaag aggccttctc gaactgggggcaaagtcatc aaatcagcct cagccactgc 2760 cttatctgtc atgattcctg cagtggacccacatggaagt tcaccccttg ctagtcccat 2820 gtctccacga tctctgtcct ccaacccatcctcacgggac tcctcaccca gccgggacta 2880 ctcaccagct gtcagtgggc tccgctcccccatcaccatc cagcgctcgg gcaagaagta 2940 tggcttcaca ctgcgtgcca tccgtgtctacatgggtgac acggatgtct atagtgtcca 3000 ccacattgtc tggcatgtgg aggaaggaggcccagcccag gaggcaggac tctgtgctgg 3060 ggacctcatc acccacgtga atggggagcctgtgcatggc atggtgcatc ctgaggtcgt 3120 ggagctgatc cttaagagtg gcaacaaggtagcagtgacc acaacgccct tcgaaaatac 3180 ctctatccgc attggtcccg caaggcgcagcagctacaag gctaaaatgg ctcggaggaa 3240 caagcgaccc tccgccaagg agggccaggagagcaagaag cgcagctccc tcttccggaa 3300 gatcacgaag cagtcgaacc tgctgcatactagccgctcg ctgtcgtcgc tgaaccgctc 3360 gctgtcatcc agcgatagtc tcccgggctcgcctacgcac gggctgccgg cgcgctcgcc 3420 cacgcacagc taccgctcca cgcctgactccgcctaccta ggcgcctcat cccagagcag 3480 ctccccagcc tcgagcacgc ccaactcgcctgcgtcgtcg gcgtcgcacc acattcggcc 3540 cagcacgctg cacggactgt cgccaaagctccatcgccag taccgctctg cgcgatgcaa 3600 gtcggccggc aacatccctc tatcgccgctggcacacacg ccgtccccca cgcaggcgtc 3660 accgccgcca ctgccgggcc acacggtgggcagctcgcac actactcaga gcttcccggc 3720 caaactgcac tcatcgcctc ccgtcgtgcgcccgcgcccc aagagtgccg agccccctcg 3780 ctcgccgctc ctcaagcgcg tgcagtcggccgagaagctg ggagcctctt tgagtgcgga 3840 caagaagggc gcgctgcgca aacacagcctcgaggtgggc cacccggatt tccgcaagga 3900 cttccatggc gagctggcgc tgcatagccttgccgagtcc gacggtgaga cgcccccagt 3960 cgagggcgtt ggcgcgcccc ggcaggtcgccgtccgccgc ctgggccgac aggagtcacc 4020 tttgagcctg ggcgcggacc cgttgctgcccgagggtgcc tccaggccac cagtgtcgag 4080 caaggagaag gaatccccgg 
ggggcgccgaggcgtgcacc ccaccccgcg cgacgacccc 4140 cggtggccgg accctggagc gggacgtcggctgcacgcgg catcagagcg tgcagacgga 4200 ggatggcact ggcgggatgg ccagggctgtggccaaggcg gcgctgagcc cggtgcagga 4260 acacgagaca ggccggcgca gcagctctggcgaggcgggc acacccctgg tacccattgt 4320 cgtagagcct gcgcggcccg gggctaaggctgtggtgcct cagcctctgg gcgcggactc 4380 caaggggttg caggaacccg cacccctggcgccttccgtg cccgaggccc cccggggccg 4440 ggagcgctgg gtgttggagg tggtggaggagcgcaccacg ctgagcggtc ctcgctccaa 4500 gcccgcctcc ccaaagctct ccccggagccccagacaccc tccctagccc cagcgaagtg 4560 cagtgcaccc agcagtgcag tgaccccagtcccacccgca tccctcttgg gctcaggcac 4620 caagcctcaa gtggggctga cctcccggtgccctgctgaa gctgtgcccc cagcaggcct 4680 gaccaaaaaa ggagtgtcca gtcccgcacccccgggacca tagccaaggg ggtcatcggc 4740 9 4867 DNA Homo sapiens 9tgctccccgc gccgccgccg ccgccgcctc cgccgctgct gccgcacctg ccaccatgtc 60gccgccgccg ggtcatgtct gactctctct ggaccgcgct ttctaatttc tcgatgccct 120ccttccccgg cggcagtatg ttccgccgca ccaagagctg ccgcaccagt aatcggaaaa 180gcctcatcct gaccagcact tcacccacgc taccgagacc ccactccccg ctgccaggtc 240acctaggcag cagtcccctg gacagccccc gaaacttctc ccccaacacc cccgcccact 300tctcgtttgc ctcctcccga agggcggacg gacgccggtg gtctctggcc tcgctccctt 360catctggcta tggcaccaac acgcccagtt ccaccgtctc gtcctcctgc tcctcccagg 420agcgccttca ccagctgccc taccagccca cggtggacga gctccacttc ctctccaaac 480acttcgggag caccgagagc atcacagacg aggatggtgg ccgtcgctcc ccagccgtgc 540ggccccgctc acggagcctc agccccgggc gctccccctc ctcctacgac aacgagatcg 600tgatgatgaa tcacgtctac aaggagaggt tcccgaaggc cactgcgcag atggaggaga 660agctgcgcga ctttacccgc gcctacgaac ccgacagcgt tctgcctctg gccgatggcg 720tgctcagctt catccaccac cagatcatcg agctggcccg ggactgcctg accaagtccc 780gtgacggcct catcaccacg gtctacttct atgaattgca ggagaacctg gagaagctcc 840ttcaagacgc ctatgaacgc tctgagagct tggaggtggc cttcgttact cagctggtga 900agaagttgct tattatcatc tcacgccctg cgaggctgct ggagtgcctg gaattcaacc 960ccgaggagtt ctaccacctg ctggaggcgg ccgaaggaca cgccaaggag ggccaccttg 1020tgaagacgga catcccccgc tacatcatcc gccagctggg 
cctcacccgt gacccctttc 1080cagatgtggt gcatctggag gaacaggaca gtggtggttc caacacccct gagcaagacg 1140atctctctga gggccgcagc agcaaggcca agaaaccgcc gggggagaat gacttcgata 1200ccatcaagct cataagcaac ggtgcctacg gcgctgtcta cctggtgcgg caccgcgaca 1260cgcggcagcg ctttgccatg aaaaagatca acaagcagaa cttgatcctc cgcaaccaga 1320tccagcaggc ctttgtggag cgcgatatcc tcaccttcgc cgagaacccg tttgtggtcg 1380gcatgttctg ctcctttgag actcggcgcc acctctgcat ggtcatggaa tatgtggaag 1440gcggcgactg tgccaccctg ctgaagaata ttggagcgct gcccgtagag atggcccgca 1500tgtactttgc tgagacggtg ctagccctgg agtatttgca caactatggc atcgtgcacc 1560gcgacctcaa gcctacagcc tccttatcac ctccatgggt cacatcaagc tcacagattt 1620cggcctctcc aagatggggc tcatgagcct caccaccaac ttatatgaag gccacatcga 1680gaaggacgcc cgagagttcc tggacaaaca ggtgtgtggg accccagagt acatcgcgcc 1740cgaggtcatc ctgcgtcaag gctacggcaa gccagtggac tggtgggcta tggggatcat 1800cctctacgag ttcctggtgg gctgtgtgcc cttcttcgga gacacaccag aggagctatt 1860tggacaggtc atcagtgatg acatcctgtg gcccgagggg gatgaggccc tacctacgga 1920ggcccaactc ctcatatcca gcctcctgca gaccaaccct ctggtcaggc ttggggcagg 1980cggcgctttt gaggtgaagc agcacagttt ctttcgagac ctggactgga cagggctgct 2040gaggcagaag gccgagttca tcccccacct agagtcggaa gatgacacta gctactttga 2100cacccgctca gacaggtatc accacgtgaa ctcctatgac gaggatgaca cgacggagga 2160ggagcccgtg gaaatccgcc agttctcttc ctgctctccg cgcttcagca aggtgtatag 2220cagcatggag cagctgtcgc agcacgagcc caagacccca gtagcagctg cagggagcag 2280caagcgggag ccgagcacca agggccccga ggagaaggtg gccggcaagc gggaggggct 2340gggcggcctg accctgcgtg agaagacctg gagagggggc tctccggaga tcaagcgatt 2400ctccgcgtcc gaggccagtt tcctggaggg agaggccagt ccccctttgg gcgcccgccg 2460ccgtttctcg gcgctgctgg agcccagccg cttcagcgcc ccccaagagg acgaggatga 2520ggcccggctg cgcaggcctc cccggcccag ctccgacccc gcgggatccc tggatgcacg 2580ggcccccaaa gaggagactc aaggggaagg cacctccagc gccggggact ccgaggccac 2640tgaccgtcca cgcccaggtg acctctgccc accctcgaag gatggggatg catcaggccc 2700aagggctacc aatgacttgg ttctgcgccg ggcgcggcac cagcagatgt caggggatgt 2760ggcagtagag 
aagaggcctt ctcgaactgg gggcaaagtc atcaaatcag cctcagccac 2820tgccttatct gtcatgattc ctgcagtgga cccacatgga agttcacccc ttgctagtcc 2880catgtctcca cgatctctgt cctccaaccc atcctcacgg gactcctcac ccagccggga 2940ctactcacca gctgtcagtg ggctccgctc ccccatcacc atccagcgct cgggcaagaa 3000gtatggcttc acactgcgtg ccatccgtgt ctacatgggt gacacggatg tctatagtgt 3060ccaccacatt gtctggcatg tggaggaagg aggcccagcc caggaggcag gactctgtgc 3120tggggacctc atcacccacg tgaatgggga gcctgtgcat ggcatggtgc atcctgaggt 3180cgtggagctg atccttaaga gtggcaacaa ggtagcagtg accacaacgc ccttcgaaaa 3240tacctctatc cgcattggtc ccgcaaggcg cagcagctac aaggctaaaa tggctcggag 3300gaacaagcga ccctccgcca aggagggcca ggagagcaag aagcgcagct ccctcttccg 3360gaagatcacg aagcagtcga acctgctgca tactagccgc tcgctgtcgt cgctgaaccg 3420ctcgctgtca tccagcgata gtctcccggg ctcgcctacg cacgggctgc cggcgcgctc 3480gcccacgcac agctaccgct ccacgcctga ctccgcctac ctaggcgcct catcccagag 3540cagctcccca gcctcgagca cgcccaactc gcctgcgtcg tcggcgtcgc accacattcg 3600gcccagcacg ctgcacggac tgtcgccaaa gctccatcgc cagtaccgct ctgcgcgatg 3660caagtcggcc ggcaacatcc ctctatcgcc gctggcacac acgccgtccc ccacgcaggc 3720gtcaccgccg ccactgccgg gccacacggt gggcagctcg cacactactc agagcttccc 3780ggccaaactg cactcatcgc ctcccgtcgt gcgcccgcgc cccaagagtg ccgagccccc 3840tcgctcgccg ctcctcaagc gcgtgcagtc ggccgagaag ctgggagcct ctttgagtgc 3900ggacaagaag ggcgcgctgc gcaaacacag cctcgaggtg ggccacccgg atttccgcaa 3960ggacttccat ggcgagctgg cgctgcatag ccttgccgag tccgacggtg agacgccccc 4020agtcgagggc cttggcgcgc cccggcaggt cgccgtccgc cgcctgggcc gacaggagtc 4080acctttgagc ctgggcgcgg acccgttgct gcccgagggt gcctccaggc caccagtgtc 4140gagcaaggag aaggaatccc cggggggcgc cgaggcgtgc accccacccc gcgcgacgac 4200ccccggtggc cggaccctgg agcgggacgt cggctgcacg cggcatcaga gcgtgcagac 4260ggaggatggc actggcggga tggccagggc tgtggccaag gcggcgctga gcccggtgca 4320ggaacacgag acaggccggc gcagcagctc tggcgaggcg ggcacacccc tggtacccat 4380tgtcgtagag cctgcgcggc ccggggctaa ggctgtggtg cctcagcctc tgggcgcgga 4440ctccaagggg ttgcaggaac ccgcacccct ggcgccttcc 
gtgcccgagg ccccccgggg 4500ccgggagcgc tgggtgttgg aggtggtgga ggagcgcacc acgctgagcg gtcctcgctc 4560caagcccgcc tccccaaagc tctccccgga gccccagaca ccctccctag ccccagcgaa 4620gtgcagtgca cccagcagtg cagtgacccc agtcccaccc gcatccctct tgggctcagg 4680caccaagcct caagtggggc tgacctcccg gtgccctgct gaagctgtgc ccccagcagg 4740cctgaccaaa aaaggagtgt ccagtcccgc acccccggga ccatagccaa gggggtcatc 4800ggccccgcgc tgtacagcct ccgtatacat atgtacacat ataaataaag tgcgtccgtg 4860ctgcgtg 4867 10 5886 DNA Homo sapiens 10 ggacgagtcg agcctcctgcggcgccgcgg gctccagaag gagctgagcc tgccacgccg 60 aggacgtggc tgccgcagcgggaaccgcaa gagcttggtg gtaggaacgc cctccccgac 120 cctctcccgg cccctgtcgccattgtcggt cccaacggca ggcagcagcc ccttggatag 180 tcctcggaat ttctcggctgcctctgccct aaatttcccc tttgcccgga gggcagacgg 240 cagaagatgg tccctcgcgtctctcccatc ttccggctat ggaaccaaca cacccagctc 300 caccctctcg tcaagctcatcctcccggga acgtctccac cagcttccct tccagccgac 360 gccggacgag ctgcacttcctgtccaagca cttccgcagc tcagagaatg tgcttgatga 420 ggaaggcggc cggtcaccccgcctccgacc ccgctctcgc agtctcagcc cgggccgtgc 480 aacggggacc ttcgacaatgagattgtcat gatgaatcac gtgtaccggg agaggttccc 540 caaggccaca gcacagatggagggccgtct gcaggagttc ctgacggcct acgcgcccgg 600 cgcccggctg gcgctggctgatggcgtctt gggcttcatc caccaccaga tcgtcgagct 660 ggcccgagac tgcttggccaagtctggcga gaacctcgtc acctcccgct acttcctaga 720 gatgcaggag aagctggagcggcttctgca ggatgcccat gagcgttcgg acagtgagga 780 ggtcagcttc atcgtccagcttgtccggaa actgctgatc atcatctcac ggccagctcg 840 gctgctggag tgtctggagtttgaccctga ggaattttac cacctgctgg aggcggctga 900 gggccatgcg cgggagggccaaggcattaa gactgacctt ccacagtaca tcattgggca 960 gctgggcctg gccaaggaccccctggagga gatggtgcca ctgagtcacc tcgaagaaga 1020 acagccccca gcacctgagtccccagagag ccgcgccctg gtcggccagt cacggaggaa 1080 gccatgcgaa agcgactttgagaccatcaa actcattagc aacggagcct atggggccgt 1140 ctacctggtg cggcaccgtgacacacggca gcgctttgcc atcaagaaga tcaacaaaca 1200 gaacttgatc ctgcgtaaccagatccagca ggtctttgtg gagcgtgaca ttctcacctt 1260 tgccgagaac ccctttgtggtcagcatgtt ctgctccttt gagacccggc 
gccacctatg 1320 tatggtcatg gaatacgtggaaggcggcga ctgcgccacg ctcctgaaga acatgggccc 1380 gctgcccgtg gacatggcccgcctgtactt cgccgagacg gtgttggcgc tggagtacct 1440 gcataactat ggcatcgtgcaccgtgacct caaaccagac aatctgctca tcacctcgct 1500 tggccacatc aagctcacggacttcggcct gtccaagatc ggcctcatga gcatggccac 1560 caacctctat gagggccacatcgagaagga cgcccgagag ttcatcgaca agcaggtgtg 1620 tgggacgccg gagtacatagcccccgaggt gatcttccgc cagggctatg ggaagccagt 1680 ggactggtgg gccatgggcgtcgtcctcta tgagtttctg gtgggctgcg tgcctttctt 1740 tggagatacc cccgaggaactcttcggtca ggtggtcagc gatgagatca tgtggccaga 1800 gggagatgag gcccttccagcagacgccca ggacctcatc accaggttgc tccggcagag 1860 cccgctggac cgtctgggcactggtggcac ccacgaagtg aagcagcacc cctttttcct 1920 ggccctggac tgggcagggcttctccgaca caaagccgag ttcgtgcccc agctcgaagc 1980 tgaggatgat accagctactttgacacacg ttcggaacgt taccgccatc tgggctccga 2040 ggacgacgag accaatgatgaagaatcgtc cacagagatc ccccagttct cctcctgctc 2100 ccaccggttc agcaaggtctacagcagctc tgagttcctg gccgtccagc ccactcctac 2160 cttcgctgaa aggagcttcagtgaagaccg ggaggagggg tgggagcgca gcgaagtgga 2220 ctatggccgc cggctgagtgctgacatccg gctgaggtcc tggacatcct ctggatcctc 2280 ctgtcagtca tcttcgtcccagcccgagcg gggtcccagc ccatctctcc tgaataccat 2340 cagcctggac acaatgcccaagtttgcctt ctcatcagag gatgaggggg taggcccagg 2400 ccctgcaggc cccaagaggcccgtcttcat tctaggggag cctgaccccc caccagcggc 2460 caccccagtg atgcccaagccctcgagcct ttctgccgac acagctgctc tcagccacgc 2520 ccgcctacgg agcaatagcatcggcgcccg acactccaca ccaaggcctc tggatgccgg 2580 ccggggccgc cgccttgggggcccaagaga cccagcccct gagaagtcca gagcctcctc 2640 cagcggtggc agtggtggcggcagtggggg ccgcgtgccc aagtcagcct ctgtctctgc 2700 cctgtccctc atcatcacggcagatgatgg cagcggcggc cccctcatga gccccctttc 2760 cccgcgctct ctgtcctcgaacccgtcgtc ccgtgactct tcgccgagcc gagacccgtc 2820 ccccgtgtgt ggcagcctgcggccccccat cgttatccac agctctggca agaagtacgg 2880 cttcagcctg cgggcgatccgcgtctacat gggtgatagc gacgtctaca ctgtgcacca 2940 cgtcgtctgg agtgtggaggacggaagccc cgcccaggag gcgggcctgc gggctgggga 3000 cctcatcacc 
cacatcaacggggagtcagt gctggggctg gtgcacatgg acgtcgtgga 3060 gctgctgctg aagagcggcaacaagatatc cctgcggacc acagccctgg agaacacctc 3120 catcaaggtg ggccccgcccggaagaatgt ggccaagggc cgcatggcac gcaggagcaa 3180 gaggagccgt cggcgggagacccaggatcg gcggaagtca cttttcaaga agatctccaa 3240 gcagacctcc gtgctgcacaccagccgcag cttctcctcc ggactccacc actcactgtc 3300 atccagtgag agcctccccggctcgcccac ccacagcctc tcccccagcc ccaccactcc 3360 ctgccgaagc ccagcccctgatgtcccagc agataccact gcatccccac ccagcgcatc 3420 cccgagctcc agcagccccgcctccccagc tgctgctggc cacacccgcc ccagctccct 3480 gcacggcctg gctgccaagcttgggccacc ccgccccaag actggccgcc gcaagtccac 3540 cagcagcatc ccgccctccccgctggcctg cccgcccatc tccgcgcccc caccccgctc 3600 gccctcgccc ctgcccgggcacccgcccgc acctgcccga tccccgcggc tgcgccgggg 3660 ccagtcagct gacaagctgggcacagggga gcggctggat ggggaggcgg ggcggcgcac 3720 tcgtgggcca gaggccgagctcgtggtcat gcggcggctg cacctgtccg agcgccgaga 3780 ctccttcaag aagcaggaggccgtgcagga ggttagcttc gatgagccgc aggaggaggc 3840 cactgggctg cccacctcagtgccacagat cgccgtggag ggcgaggaag ccgtgccagt 3900 agctctcggg cccaccggaagagactgatc ccctgccagg tctctccctg gcatcaaagt 3960 tacgcgtttt cttgtgcaatgttttttccg taaagtcatg cctggatggg gactgagcca 4020 ccagcctgac acccagaaggcgagaagcca tctcggtcct tgctggaagg tggagacatc 4080 gcttgtgttc tggtgtcaatcggggctgga tggggcaaga atgggggaca agggtggctt 4140 tgtaaatagc agcaaatccctgcaactaat ttattacttt ttttttcttt tttttttttt 4200 ttttttgaga cagagtctcactctgttgcc cgggctggag tgcagcggcg tgatctcagc 4260 tcactgcaac ctccgcctcccaagttcaag cgattgtcct gcctcagctt cccaagtggc 4320 tgggattaca ggcgcccaccactatgccca gctaattttt tgtattttta gtacagacgg 4380 ggtttcacca tgttggtcaggctggtctcg aactcctgac ctcatgattt gcctgccttt 4440 gcctcccaaa gtgctgggattacaggcgtg agccactggg cccagcctaa tttattactt 4500 tttataagcg atagccgtactgagccgccc cctgaaggcg gctgccaggt cttgccccag 4560 gcacctggga ctctgtttgcaggccctgcc ctctgggctg agaaggatgc actttggaca 4620 agtcatctgt gtttgtgttttccagttttt ctgtactttt taagtgtttt gtgttacctg 4680 gtctcattcc cctccccacacctacccatt tgaggggatg 
gagttgaagt cacctggtca 4740 cctgtaccgg cccagttcggctacaacctg gagtgtccgt aaacaattcc tctcacccac 4800 aaaacaatgt aatcccagcgatggactgga ttctgaaggc cacttcccac catcatagct 4860 gccatgccca ggcagtgcctgctctatata tagagtctgc ctccaatcct gctggcttca 4920 gcctggagaa gggatatgggagctggagct ttgatggatg aataggtgtt caccggatct 4980 gggcagaggg gtcatccgctccccaggtgg gcactgataa aggaaggtac aggcctcacc 5040 tggaactgcc aaggcagcctccagaaatgc tcggctgtct cggggcacgc tccagtatgc 5100 cagtcctgcg ggattacgtccagctacttc cagaaacact cagtgtcccc tcccctcagg 5160 ctctgccttg gcctggccttgtccagtcta ccctggacaa gatgccgtgt gtttgaggcc 5220 cagcagagta agcccttggccgtgatgtgt ctgaaacacc tgttaggggt tccctccata 5280 tgtcagagcc tctctgggatgaagttcaag ccagaaaacc cagtcgaggc tcaagtttga 5340 atttcagctt cactgtgtggctctgggaaa atggctttcc cactctgtgc ctcagtttcc 5400 ttgtgtttac aagactaatcccattgactg tttattaagc acctactgtg tgccaagcgc 5460 ttttacgtgg cttctccctcagccagcctt gagaaggctg gaggtggtgt catcacctcc 5520 attttacaga caaagcagctgagaccccag cgaggggcgg agacctgtcc cacgatcacc 5580 cagcaggagt cgtggcagaacggagcatca gccagaccct gttgtgggcg ttgtcatcaa 5640 gggagcttga atggagggtctggtgtcaga tacagccgac tccagcccca gctcatcccc 5700 catgatgctg tgtgacccactgggcactct ggtgagggag ctttccagac atcaacagcc 5760 cactctgctt ccctttctgagtcccctgtc cagcactgcc tagtgttgga gggtagacca 5820 aggctgtgca tgattcaccccctccttcca tcctggagct ggcagtgaat aaaagcccgt 5880 atttac 5886 11 7826 DNAHomo sapiens 11 agaacgcacg gctcaactca tgtaattact gtttatagct ggccgagcctgactaggaga 60 gggcagaccc gagaggaaat cagtttcccg gacctttgag aggaggctgtgtgttaatta 120 aaggctagga cgggacgggt acttctcaga catgctccaa gttgttcttgagatcacagt 180 tcccatcaca ttttctctgg agggagtgag tagataattg ggattttttttttatttttg 240 gccttgtctt tcttcctttt ttttacctct ccccatttta gtcatatggccttgaaccca 300 cagtgaattg aagagagaaa gaaatggata tgtctgaccc caatttttggactgtgctct 360 caaactttac tttgcctcat ttgaggagtg ggaacaggct tcggcgaacacaaagttgcc 420 gaacaagcaa ccggaaaagc ttaataggca atgggcagtc accagcattgcctcgaccac 480 actcacctct ctctgctcat gcaggaaata gccctcaaga 
tagtccaagaaatttctccc 540 ccagtgcctc agcccatttt tcatttgcac ggaggactga tggacgccgctggtcgttgg 600 cttctctccc ttcctctggc tatgggacaa acacacccag ctctacggtctcttcatcct 660 gttcctccca ggagaagttg catcagttac cataccaacc aacaccagacgagttacact 720 tcttatcaaa acatttctgt accaccgaaa gcatcgccac tgagaacagatgcaggaaca 780 cgccgatgcg cccccgttcc cgaagtctga gccctggacg ttctcccgcctgctgtgacc 840 atgaaataat tatgatgaac catgtctaca aagaaaggtt cccaaaggctacagctcaga 900 tggaagaacg tctaaaggaa attatcacca gctactctcc tgacaacgttctacccttag 960 cagatggagt gcttagtttc actcaccacc agattattga actggctcgagattgcttgg 1020 ataaatccca ccagggcctc atcacctcac gatacttcct tgaattacagcacaaattag 1080 ataagttgct acaggaggct catgatcgtt cagaaagtgg agaattggcatttattaaac 1140 aactagttcg aaagatccta attgttattg cccgccctgc tcggttattagagtgcctgg 1200 aatttgatcc ggaagaattt tactacctat tggaagcagc agaaggccatgccaaagaag 1260 gacagggtat taaaaccgac attcccaggt acatcattag ccaactgggactcaataagg 1320 atcccttgga agaaatggct catttgggaa actacgatag tgggacagcagaaacaccag 1380 aaacagatga atcagtgagt agctctaatg cctccctgaa acttcgaaggaaacctcggg 1440 aaagtgattt tgaaacgatt aaattgatta gcaatggagc ctatggggcagtctactttg 1500 ttcggcataa agaatcccgg cagaggtttg ccatgaagaa gattaataaacagaacctca 1560 tccttcgaaa ccagatccag caggcctttg tggagcggga tatcctgacttttgcagaaa 1620 acccctttgt tgtcagcatg tattgctcct ttgaaacaag gcgccacttgtgcatggtca 1680 tggaatatgt ggaaggggga gactgtgcta ctttaatgaa aaacatgggtcctctccctg 1740 ttgatatggc cagaatgtac tttgctgaga cggtcttggc cttggaatatttacataatt 1800 atggaattgt acacagggat ttgaaaccag acaacttgtt ggttacctccatggggcaca 1860 taaagctgac agattttgga ttatctaagg tgggactaat gagcatgactaccaaccttt 1920 acgagggtca tattgagaag gatgctagag agttcctgga taaacaggtctgtggcacac 1980 ctgaatacat tgcaccagaa gtgattctga ggcagggtta tggaaagccggtggactggt 2040 gggccatggg gattatcctc tatgaatttc tggttggatg cgtgccattctttggggata 2100 ctccagagga gctatttgga caagtcatca gtgatgagat caactggcctgagaaggatg 2160 aggcaccccc acctgatgcc caggatctga ttaccttact cctcaggcagaatcccctgg 2220 agaggctggg aacaggtggt 
gcatatgaag tcaaacagca tcgattcttccgttctttag 2280 actggaacag tttgctgaga cagaaggcag aatttattcc ccaactggaatctgaggatg 2340 acacaagtta ttttgatact cggtctgaga agtatcatca tatggaaacggaggaagaag 2400 atgacacaaa tgatgaagac tttaatgtgg aaataaggca gttttcttcatgttcacaca 2460 ggttttcaaa agttttcagc agtatagatc gaatcactca gaattcagcagaagagaagg 2520 aagactctgt ggacaaaacc aaaagcacca ccttgccatc cacagaaacactgagctgga 2580 gttcagaata ttctgaaatg caacagctat caacatccaa ctcttcagatactgaaagca 2640 acagacataa actcagttct ggcctacttc ccaaactggc tatttcaacagagggagagc 2700 aagatgaagc tgcctcctgc cctggagacc cccatgagga gccaggaaagccagcccttc 2760 ctcctgaaga gtgtgcccag gaggagcctg aggtcaccac cccagccagcaccatcagca 2820 gctccaccct gtcagttggc agtttttcag agcacttgga tcagataaatggacgaagcg 2880 agtgtgtgga cagtacagat aattcctcaa agccatccag tgaacccgcttctcacatgg 2940 ctcggcagcg attagaaagc acagaaaaaa agaaaatctc ggggaaagtcacaaagtccc 3000 tctctgccag tgctctttcc ctcatgatcc caggagatat gtttgctgtttcccctctgg 3060 gaagtccaat gtctccccat tccctgtcct cggacccttc ttcttcacgagattcctctc 3120 ccagccgaga ttcctcagca gcttctgcca gtccacatca gccgattgtgatccacagtt 3180 cggggaagaa ctacggcttt accatccgag ccatccgggt gtatgtgggagacagtgaca 3240 tctatacagt gcaccatatc gtctggaatg tagaagaagg aagtccggcatgccaggcag 3300 gactgaaggc tggagatctt atcactcaca tcaatggaga accagtgcatggacttgtcc 3360 acacagaagt tatagaactc ctactgaaga gtgggaataa ggtgtcaatcactactaccc 3420 catttgaaaa cacatcaatc aaaactggac cagccaggag aaacagctataagagccgga 3480 tggtgaggcg gagcaagaaa tccaagaaga aagaaagtct cgaaaggaggagatctcttt 3540 tcaaaaagct agccaagcag ccttctcctt tactccacac cagccgaagtttctcctgct 3600 tgaacagatc cctgtcatcg ggtgagagcc tcccaggttc ccccactcatagcttgtctc 3660 cccggtctcc aacaccaagc taccgctcca cccctgactt cccatctggtactaattcct 3720 cccagagcag ctcccctagt tctagtgccc ccaattcccc agcagggtccgggcacatcc 3780 ggcccagcac tctccacggt cttgcaccca aactcggcgg gcagcggtaccggtccggaa 3840 ggcgaaagtc cgccggcaac atcccactgt ccccgctggc ccggacgccctctccaaccc 3900 cgcaacccac ctccccgcag cggtcaccat cccctcttct 
gggacactcactgggcaatt 3960 ccaagatcgc gcaagccttt cccagcaaga tgcactcccc gcccaccatcgtcagacaca 4020 tcgtgaggcc caagagtgcg gagcccccca ggtccccgct gctcaagcgcgtgcagtccg 4080 aggagaagct gtcgccctct tacggcagtg acaagaagca cctgtgctcccgcaagcaca 4140 gcctggaggt gacccaagag gaggtgcagc gggagcagtc ccagcgggaggcgccgctgc 4200 agagcctgga tgagaacgtg tgcgacgtgc cgccgctcag ccgcgcccggccagtggagc 4260 aaggctgcct gaaacgccca gtctcccgga aggtgggccg ccaggagtctgtggacgacc 4320 tggaccgcga caagctgaag gccaaggtgg tggtgaagaa agcagacggcttcccagaga 4380 aacaggaatc ccaccagaaa tcccatggac ccgggagtga tttggaaaactttgctctgt 4440 ttaagctgga agagagagag aagaaagtct atccgaaggc tgtggaaaggtcaagtactt 4500 ttgaaaacaa agcgtctatg caggaggcgc caccgctggg cagcctgctgaaggatgctc 4560 ttcacaagca ggccagcgtg cgcgccagcg agggtgcgat gtcggatggcccggtgcctg 4620 cggagcaccg ccagggtggc ggggacttca gacgggcccc cgctcctggcaccctccagg 4680 atggtctctg ccactccctc gacaggggca tctctgggaa gggggaaggcacggagaagt 4740 cctcccaggc caaggagctt ctccgatgtg aaaagttaga cagcaagctggccaacatcg 4800 attacctccg aaagaaaatg tcacttgagg acaaagagga caacctctgccctgtgctga 4860 agcccaagat gacagctggc tcccacgaat gcctgccagg gaacccagtccgacccacgg 4920 gtgggcagca ggagcccccg ccggcttctg agagccgagc ttttgtcagcagcacccatg 4980 cagctcagat gagtgccgtc tcttttgttc ccctcaaggc cttaacaggccgggtggaca 5040 gtggaacgga gaagcctggc ttggttgctc ctgagtcccc tgttaggaagagcccctccg 5100 agtataagct ggaaggtagg tctgtctcat gcctgaagcc gatcgagggcactctggaca 5160 ttgctctcct gtccggacct caggcctcca agacagaact gccttccccagagtctgcac 5220 agagccccag cccaagtggt gacgtgaggg cctctgtgcc accagttctccccagcagca 5280 gtgggaaaaa gaacgatacc accagtgcaa gagagctttc tccttccagcttaaagatga 5340 ataaatccta cctgctggag ccttggttcc tgccccccag ccgaggtctccagaattcac 5400 cagcagtttc cctgcctgac ccagagttca agagggacag gaaaggtccccatcctactg 5460 ccaggagccc tggaacagtc atggaaagca atccccaaca gagagagggcagctccccta 5520 aacaccaaga ccacaccact gaccccaagc ttctgacctg cctggggcagaacctccaca 5580 gccctgacct ggccaggcca cgctgcccgc tcccacctga agcttccccctcaagggaga 5640 agccaggcct 
gagggaatcg tctgaaagag gccctcccac agccagaagcgagcgctctg 5700 ctgcgagggc tgacacatgc agagagccct ccatggaact gtgctttccagaaactgcga 5760 aaaccagtga caactccaaa aatctcctct ctgtgggaag gacccacccagatttctata 5820 cacagaccca ggccatggag aaagcatggg cgccgggtgg gaaaacgaaccacaaagatg 5880 gcccaggtga ggcgaggccc ccgcccagag acaactcctc tctgcactcagctggaattc 5940 cctgtgagaa ggagctgggc aaggtgaggc gtggcgtgga acccaagcccgaagcgcttc 6000 ttgccaggcg gtctctgcag ccacctggaa ttgagagtga gaagagtgaaaagctctcca 6060 gtttcccatc tttgcagaaa gatggtgcca aggaacctga aaggaaggagcagcctctac 6120 aaaggcatcc cagcagcatc cctccgcccc ctctgacggc caaagacctgtccagcccgg 6180 ctgccaggca gcattgcagt tccccaagcc acgcttctgg cagagagccgggggccaagc 6240 ccagcactgc agagcccagc tcgagccccc aggaccctcc caagcctgttgctgcgcaca 6300 gtgaaagcag cagccacaag ccccggcctg gccctgaccc gggccctccaaagactaagc 6360 accccgaccg gtccctctcc tctcagaaac caagtgtcgg ggccacaaagggcaaagagc 6420 ctgccactca gtccctcggt ggctctagca gagaggggaa gggccacagtaagagtgggc 6480 cggatgtgtt tcctgctacc ccaggctccc agaacaaagc cagcgatgggattggccagg 6540 gagaaggtgg gccctctgtc ccactgcaca ctgacagggc tcctctagacgccaagccac 6600 aacccaccag tggtgggcgg cccctggagg tgctggagaa gcctgtgcatttgccaaggc 6660 cgggacaccc agggcctagt gagccagcgg accagaaact gtccgctgttggtgaaaagc 6720 aaaccctgtc tccaaagcac cccaaaccat ccactgtgaa agattgccccaccctgtgca 6780 aacagacaga caacagacag acagacaaaa gcccgagtca gccggccgccaacaccgaca 6840 gaagggcgga agggaagaaa tgcactgaag cactttatgc tccagcagagggcgacaagc 6900 tcgaggccgg cctttccttt gtgcatagcg agaaccggtt gaaaggcgcggagcggccag 6960 ccgcgggggt ggggaagggc ttccctgagg ccagagggaa agggcccggtccccagaagc 7020 caccgacgga ggcagacaag cccaatggca tgaaacggtc cccctcagccactgggcaga 7080 gttctttccg atccacggcc ctcccggaaa agtctctgag ctgctcctccagcttccctg 7140 aaaccagggc cggagttaga gaggcctctg cagccagcag cgacacctcttctgccaagg 7200 ccgccggggg catgctggag cttccagccc ccagcaacag ggaccataggaaggctcagc 7260 ctgccgggga gggccgaacc cacatgacaa agagtgactc cctgccctccttccgggtct 7320 ccaccctgcc tctggagtca caccaccccg acccaaacac 
catgggcggggccagccacc 7380 gggacagggc tctctcggtg actgccaccg taggggaaac caaagggaaggaccctgccc 7440 cagcccagcc tcccccagct aggaaacaga acgtgggcag agacgtgaccaagccatccc 7500 cagccccaaa cactgaccgc cccatctctc tttctaatga gaaggactttgtggtacggc 7560 agaggcgggg gaaagagagt ttgcgtagca gccctcacaa aaaggccttgtaacggggag 7620 ggcccagggg caggactgtg gagacccgtc ctgaacgggc gactgtgtcttgactacctt 7680 tcaaaaccag cactgtgtgg gaatgtccgc caggcagagc tcggagcctcattgagacag 7740 gggagagaga aagacaaaga ggggaccttc ttccagatgc cttcccagttgtaaccggta 7800 aaactgttac cagatagtgt ttgtac 7826 12 1005 DNA Homosapiens 12 ggcacgaggg tccagcccgg ctgccaggca gcattgcagt tccccaagccacgcttctgg 60 cagagagccg ggggccaagc ccagcactgc agagcccagc tcgagcccccaggaccctcc 120 caagcctgtt gctgcgcaca gtgaaagcag cagccacaaa gggcaaagagcctgccaccg 180 acggaggcag acaagcccaa tggcatgaaa cggtccccct cagccactgggcagagttct 240 ttccgatcca cggccctccc ggaaaagtct ctgagctgct cctccagcttccctgaaacc 300 agggccggag ttagagaggc ctctgcagcc agcagcgaca cctcttctgccaaggccgcc 360 gggggcatgc tggagcttcc agcccccagc aacagggacc ataggaaggctcagcctgcc 420 ggggagggcc gaacccacat gacaaagagt gactccctgc cctccttccgggtctccacc 480 ctgcctctgg agtcacacca ccccgaccca aacaccatgg gcggggccagccaccgggac 540 agggctctct cggtgactgc caccgtaggg gaaaccaaag ggaaggaccctgccccagcc 600 cagcctcccc cagctaggaa acagaacgtg ggcagagacg tgaccaagccatccccagcc 660 ccaaacactg accgccccat ctctctttct aatgagaagg actttgtggtacggcggagg 720 cgggggaaag agagtttgcg tagcagccct cacaaaaagg ccttgtaacggggagggccc 780 aggggcagga ctgtggagac ccgtcctgaa cgggcgactg tgtcttgactacctttcaaa 840 accagcactg tgtgggaatg tccgccaggc agagctcgga gcctcattgagacaggggag 900 agagaaagac aaagagggga ccttcttcca gatgccttcc cagttgtaaccggtaaaact 960 gttaccagat agtgtttgta caaaaaaaaa aaaaaaaaaa aaaaa 1005 136629 DNA Homo sapiens 13 tggaatttga tccggaagaa ttttactacc tattggaagcagcagaaggc catgccaaag 60 aaggacaggg tattaaaacc gacattccca ggtacatcattagccaactg ggactcaata 120 aggatccctt ggaagaaatg gctcatttgg gaaactacgatagtgggaca gcagaaacac 180 cagaaacaga tgaatcagtg 
agtagctcta atgcctccctgaaacttcga aggaaacctc 240 gggaaagtga ttttgaaacg attaaattga ttagcaatggagcctatggg gcagtctact 300 ttgttcggca taaagaatcc cggcagaggt ttgccatgaagaagattaat aaacagaacc 360 tcatccttcg aaaccagatc cagcaggcct ttgtggagcgggatatcctg acttttgcag 420 aaaacccctt tgttgtcagc atgtattgct cctttgaaacaaggcgccac ttgtgcatgg 480 tcatggaata tgtggaaggg ggagactgtg ctactttaatgaaaaacatg ggtcctctcc 540 ctgttgatat ggccagaatg tactttgctg agacggtcttggccttggaa tatttacata 600 attatggaat tgtacacagg gatttgaaac cagacaacttgttggttacc tccatggggc 660 acataaagct gacagatttt ggattatcta aggtgggactaatgagcatg actaccaacc 720 tttacgaggg tcatattgag aaggatgcta gagagttcctggataaacag gtctgtggca 780 cacctgaata cattgcacca gaagtgattc tgaggcagggttatggaaag ccggtggact 840 ggtgggccat ggggattatc ctctatgaat ttctggttggatgcgtgcca ttctttgggg 900 atactccaga ggagctattt ggacaagtca tcagtgatgagatcaactgg cctgagaagg 960 atgaggcacc cccacctgat gcccaggatc tgattaccttactcctcagg cagaatcccc 1020 tggagaggct gggaacaggt ggtgcatatg aagtcaaacagcatcgattc ttccgttctt 1080 tagactggaa cagtttgctg agacagaagg cagaatttattccccaactg gaatctgagg 1140 atgacacaag ttattttgat actcggtctg agaagtatcatcatatggaa acggaggaag 1200 aagatgacac aaatgatgaa gactttaatg tggaaataaggcagttttct tcatgttcac 1260 acaggttttc aaaagttttc agcagtatag atcgaatcactcagaattca gcagaagaga 1320 aggaagactc tgtggacaaa accaaaagca ccaccttgccatccacagaa acactgagct 1380 ggagttcaga atattctgaa atgcaacagc tatcaacatccaactcttca gatactgaaa 1440 gcaacagaca taaactcagt tctggcctac ttcccaaactggctatttca acagagggag 1500 agcaagatga agctgcctcc tgccctggag acccccatgaggagccagga aagccagccc 1560 ttcctcctga agagtgtgcc caggaggagc ctgaggtcaccaccccagcc agcaccatca 1620 gcagctccac cctgtcagtt ggcagttttt cagagcacttggatcagata aatggacgaa 1680 gcgagtgtgt ggacagtaca gataattcct caaagccatccagtgaaccc gcttctcaca 1740 tggctcggca gcgattagaa agcacagaaa aaaagaaaatctcggggaaa gtcacaaagt 1800 ccctctctgc cagtgctctt tccctcatga tcccaggagatatgtttgct gtttcccctc 1860 tgggaagtcc aatgtctccc cattccctgt cctcggacccttcttcttca cgagattcct 1920 
ctcccagccg agattcctca gcagcttctg ccagtccacatcagccgatt gtgatccaca 1980 gttcggggaa gaactacggc tttaccatcc gagccatccgggtgtatgtg ggagacagtg 2040 acatctatac agtgcaccat atcgtctgga atgtagaagaaggaagtccg gcatgccagg 2100 caggactgaa ggctggagat cttatcactc acatcaatggagaaccagtg catggacttg 2160 tccacacaga agttatagaa ctcctactga agagtgggaataaggtgtca atcactacta 2220 ccccatttga aaacacatca atcaaaactg gaccagccaggagaaacagc tataagagcc 2280 ggatggtgag gcggagcaag aaatccaaga agaaagaaagtctcgaaagg aggagatctc 2340 ttttcaaaaa gctagccaag cagccttctc ctttactccacaccagccga agtttctcct 2400 gcttgaacag atccctgtca tcgggtgaga gcctcccaggttcccccact catagcttgt 2460 ctccccggtc tccaacacca agctaccgct ccacccctgacttcccatct ggtactaatt 2520 cctcccagag cagctcccct agttctagtg cccccaattccccagcaggg tccgggcaca 2580 tccggcccag cactctccac ggtcttgcac ccaaactcggcgggcagcgg taccggtccg 2640 gaaggcgaaa gtccgccggc aacatcccac tgtccccgctggcccggacg ccctctccaa 2700 ccccgcaacc cacctccccg cagcggtcac catcccctcttctgggacac tcactgggca 2760 attccaagat cgcgcaagcc tttcccagca agatgcactccccgcccacc atcgtcagac 2820 acatcgtgag gcccaagagt gcggagcccc ccaggtccccgctgctcaag cgcgtgcagt 2880 ccgaggagaa gctgtcgccc tcttacggca gtgacaagaagcacctgtgc tcccgcaagc 2940 acagcctgga ggtgacccaa gaggaggtgc agcgggagcagtcccagcgg gaggcgccgc 3000 tgcagagcct ggatgagaac gtgtgcgacg tgccgccgctcagccgcgcc cggccagtgg 3060 agcaaggctg cctgaaacgc ccagtctccc ggaaggtgggccgccaggag tctgtggacg 3120 acctggaccg cgacaagctg aaggccaagg tggtggtgaagaaagcagac ggcttcccag 3180 agaaacagga atcccaccag aaatcccatg gacccgggagtgatttggaa aactttgctc 3240 tgtttaagct ggaagagaga gagaagaaag tctatccgaaggctgtggaa aggtcaagta 3300 cttttgaaaa caaagcgtct atgcaggagg cgccaccgctgggcagcctg ctgaaggatg 3360 ctcttcacaa gcaggccagc gtgcgcgcca gcgagggtgcgatgtcggat ggcccggtgc 3420 ctgcggagca ccgccagggt ggcggggact tcagacgggcccccgctcct ggcaccctcc 3480 aggatggtct ctgccactcc ctcgacaggg gcatctctgggaagggggaa ggcacggaga 3540 agtcctccca ggccaaggag cttctccgat gtgaaaagttagacagcaag ctggccaaca 3600 tcgattacct ccgaaagaaa atgtcacttg 
aggacaaagaggacaacctc tgccctgtgc 3660 tgaagcccaa gatgacagct ggctcccacg aatgcctgccagggaaccca gtccgaccca 3720 cgggtgggca gcaggagccc ccgccggctt ctgagagccgagcttttgtc agcagcaccc 3780 atgcagctca gatgagtgcc gtctcttttg ttcccctcaaggccttaaca ggccgggtgg 3840 acagtggaac ggagaagcct ggcttggttg ctcctgagtcccctgttagg aagagcccct 3900 ccgagtataa gctggaaggt aggtctgtct catgcctggagccgatcgag ggcactctgg 3960 acattgctct cctgtccgga cctcaggcct ccaagacagaactgccttcc ccagagtctg 4020 cacagagccc cagcccaagt ggtgacgtga gggcctctgtgccaccagtt ctccccagca 4080 gcagtgggaa aaagaacgat accaccagtg caagagagctttctccttcc agcttaaaga 4140 tgaataaatc ctacctgctg gagccttggt tcctgccccccagccgaggt ctccagaatt 4200 caccagcagt ttccctgcct gacccagagt tcaagagggacaggaaaggt ccccatccta 4260 ctgccaggag ccctggaaca gtcatggaaa gcaatccccaacagagagag ggcagctccc 4320 ctaaacacca agaccacacc actgacccca agcttctgacctgcctgggg cagaacctcc 4380 acagccctga cctggccagg ccacgctgcc cgctcccacctgaagcttcc ccctcaaggg 4440 agaagccagg cctgagggaa tcgtctgaaa gaggccctcccacagccaga agcgagcgct 4500 ctgctgcgag ggctgacaca tgcagagagc cctccatggaactgtgcttt ccagaaactg 4560 cgaaaaccag tgacaactcc aaaaatctcc tctctgtgggaaggacccac ccagatttct 4620 atacacagac ccaggccatg gagaaagcat gggcgccgggtgggaaaacg aaccacaaag 4680 atggcccagg tgaggcgagg cccccgccca gagacaactcctctctgcac tcagctggaa 4740 ttccctgtga gaaggagctg ggcaaggtga ggcgtggcgtggaacccaag cccgaagcgc 4800 ttcttgccag gcggtctctg cagccacctg gaattgagagtgagaagagt gaaaagctct 4860 ccagtttccc atctttgcag aaagatggtg ccaaggaacctgaaaggaag gagcagcctc 4920 tacaaaggca tcccagcagc atccctccgc cccctctgacggccaaagac ctgtccagcc 4980 cggctgccag gcagcattgc agttccccaa gccacgcttctggcagagag ccgggggcca 5040 agcccagcac tgcagagccc agctcgagcc cccaggaccctcccaagcct gttgctgcgc 5100 acagtgaaag cagcagccac aagccccggc ctggccctgacccgggccct ccaaagacta 5160 agcaccccga ccggtccctc tcctctcaga aaccaagtgtcggggccaca aagggcaaag 5220 agcctgccac tcagtccctc ggtggctcta gcagagaggggaagggccac agtaagagtg 5280 ggccggatgt gtttcctgct accccaggct cccagaacaaagccagcgat gggattggcc 5340 
agggagaagg tgggccctct gtcccactgc acactgacagggctcctcta gacgccaagc 5400 cacaacccac cagtggtggg cggcccctgg aggtgctggagaagcctgtg catttgccaa 5460 ggccgggaca cccagggcct agtgagccag cggaccagaaactgtccgct gttggtgaaa 5520 agcaaaccct gtctccaaag caccccaaac catccactgtgaaagattgc cccaccctgt 5580 gcaaacagac agacaacaga cagacagaca aaagcccgagtcagccggcc gccaacaccg 5640 acagaagggc ggaagggaag aaatgcactg aagcactttatgctccagca gagggcgaca 5700 agctcgaggc cggcctttcc tttgtgcata gcgagaaccggttgaaaggc gcggagcggc 5760 cagccgcggg ggtggggaag ggcttccctg aggccagagggaaagggccc ggtccccaga 5820 agccaccgac ggaggcagac aagcccaatg gcatgaaacggtccccctca gccactgggc 5880 agagttcttt ccgatccacg gccctcccgg aaaagtctctgagctgctcc tccagcttcc 5940 ctgaaaccag ggccggagtt agagaggcct ctgcagccagcagcgacacc tcttctgcca 6000 aggccgccgg gggcatgctg gagcttccag cccccagcaacagggaccat aggaaggctc 6060 agcctgccgg ggagggccga acccacatga caaagagtgactccctgccc tccttccggg 6120 tctccaccct gcctctggag tcacaccacc ccgacccaaacaccatgggc ggggccagcc 6180 accgggacag ggctctctcg gtgactgcca ccgtaggggaaaccaaaggg aaggaccctg 6240 ccccagccca gcctccccca gctaggaaac agaacgtgggcagagacgtg accaagccat 6300 ccccagcccc aaacactgac cgccccatct ctctttctaatgagaaggac tttgtggtac 6360 ggcagaggcg ggggaaagag agtttgcgta gcagccctcacaaaaaggcc ttgtaacggg 6420 gagggcccag gggcaggact gtggagaccc gtcctgaacgggcgactgtg tcttgactac 6480 ctttcaaaac cagcactgtg tgggaatgtc cgccaggcagagctcggagc ctcattgaga 6540 caggggagag agaaagacaa agaggggacc ttcttccagatgccttccca gttgtaaccg 6600 gtaaaactgt taccagatag tgtttgtac 6629 14 1734 PRT Homo sapiens 14 Met Lys Arg Ser Arg Cys Arg Asp Arg Pro Gln Pro ProPro Pro Asp 1 5 10 15 Arg Arg Glu Asp Gly Val Gln Arg Ala Ala Glu LeuSer Gln Ser Leu 20 25 30 Pro Pro Arg Arg Arg Ala Pro Pro Gly Arg Gln ArgLeu Glu Glu Arg 35 40 45 Thr Gly Pro Ala Gly Pro Glu Gly Lys Glu Gln AspVal Val Thr Gly 50 55 60 Val Ser Pro Leu Leu Phe Arg Lys Leu Ser Asn ProAsp Ile Phe Ser 65 70 75 80 Ser Thr Gly Lys Val Lys Leu Gln Arg Gln LeuSer Gln Asp Asp Cys 85 90 95 Lys Leu Trp Arg Gly Asn Leu 
Ala Ser Ser LeuSer Gly Lys Gln Leu 100 105 110 Leu Pro Leu Ser Ser Ser Val His Ser SerVal Gly Gln Val Thr Trp 115 120 125 Gln Ser Ser Gly Glu Ala Ser Asn LeuVal Arg Met Arg Asn Gln Ser 130 135 140 Leu Gly Gln Ser Ala Pro Ser LeuThr Ala Gly Leu Lys Glu Leu Ser 145 150 155 160 Leu Pro Arg Arg Gly SerPhe Cys Arg Thr Ser Asn Arg Lys Ser Leu 165 170 175 Ile Val Thr Ser SerThr Ser Pro Thr Leu Pro Arg Pro His Ser Pro 180 185 190 Leu His Gly HisThr Gly Asn Ser Pro Leu Asp Ser Pro Arg Asn Phe 195 200 205 Ser Pro AsnAla Pro Ala His Phe Ser Phe Val Pro Ala Arg Arg Thr 210 215 220 Asp GlyArg Arg Trp Ser Leu Ala Ser Leu Pro Ser Ser Gly Tyr Gly 225 230 235 240Thr Asn Thr Pro Ser Ser Thr Val Ser Ser Ser Cys Ser Ser Gln Glu 245 250255 Lys Leu His Gln Leu Pro Phe Gln Pro Thr Ala Asp Glu Leu His Phe 260265 270 Leu Thr Lys His Phe Ser Thr Glu Ser Val Pro Asp Glu Glu Gly Arg275 280 285 Gln Ser Pro Ala Met Arg Pro Arg Ser Arg Ser Leu Ser Pro GlyArg 290 295 300 Ser Pro Val Ser Phe Asp Ser Glu Ile Ile Met Met Asn HisVal Tyr 305 310 315 320 Lys Glu Arg Phe Pro Lys Ala Thr Ala Gln Met GluGlu Arg Leu Ala 325 330 335 Glu Phe Ile Ser Ser Asn Thr Pro Asp Ser ValLeu Pro Leu Ala Asp 340 345 350 Gly Ala Leu Ser Phe Ile His His Gln ValIle Glu Met Ala Arg Asp 355 360 365 Cys Leu Asp Lys Ser Arg Ser Gly LeuIle Thr Ser Gln Tyr Phe Tyr 370 375 380 Glu Leu Gln Glu Asn Leu Glu LysLeu Leu Gln Asp Ala His Glu Arg 385 390 395 400 Ser Glu Ser Ser Glu ValAla Phe Val Met Gln Leu Val Lys Lys Leu 405 410 415 Met Ile Ile Ile AlaArg Pro Ala Arg Leu Leu Glu Cys Leu Glu Phe 420 425 430 Asp Pro Glu GluPhe Tyr His Leu Leu Glu Ala Ala Glu Gly His Ala 435 440 445 Lys Glu GlyGln Gly Ile Lys Cys Asp Ile Pro Arg Tyr Ile Val Ser 450 455 460 Gln LeuGly Leu Thr Arg Asp Pro Leu Glu Glu Met Ala Gln Leu Ser 465 470 475 480Ser Cys Asp Ser Pro Asp Thr Pro Glu Thr Asp Asp Ser Ile Glu Gly 485 490495 His Gly Ala Ser Leu Pro Ser Lys Lys Thr Pro Ser Glu Glu Asp Phe 500505 510 Glu Thr Ile Lys Leu Ile Ser Asn Gly Ala Tyr Gly Ala Val Phe 
Leu515 520 525 Val Arg His Lys Ser Thr Arg Gln Arg Phe Ala Met Lys Lys IleAsn 530 535 540 Lys Gln Asn Leu Ile Leu Arg Asn Gln Ile Gln Gln Ala PheVal Glu 545 550 555 560 Arg Asp Ile Leu Thr Phe Ala Glu Asn Pro Phe ValVal Ser Met Phe 565 570 575 Cys Ser Phe Asp Thr Lys Arg His Leu Cys MetVal Met Glu Tyr Val 580 585 590 Glu Gly Gly Asp Cys Ala Thr Leu Leu LysAsn Ile Gly Ala Leu Pro 595 600 605 Val Asp Met Val Arg Leu Tyr Phe AlaGlu Thr Val Leu Ala Leu Glu 610 615 620 Tyr Leu His Asn Tyr Gly Ile ValHis Arg Asp Leu Lys Pro Asp Asn 625 630 635 640 Leu Leu Ile Thr Ser MetGly His Ile Lys Leu Thr Asp Phe Gly Leu 645 650 655 Ser Lys Met Gly LeuMet Ser Leu Thr Thr Asn Leu Tyr Glu Gly His 660 665 670 Ile Glu Lys AspAla Arg Glu Phe Leu Asp Lys Gln Val Cys Gly Thr 675 680 685 Pro Glu TyrIle Ala Pro Glu Val Ile Leu Arg Gln Gly Tyr Gly Lys 690 695 700 Pro ValAsp Trp Trp Ala Met Gly Ile Ile Leu Tyr Glu Phe Leu Val 705 710 715 720Gly Cys Val Pro Phe Phe Gly Asp Thr Pro Glu Glu Leu Phe Gly Gln 725 730735 Val Ile Ser Asp Glu Ile Val Trp Pro Glu Gly Asp Glu Ala Leu Pro 740745 750 Pro Asp Ala Gln Asp Leu Thr Ser Lys Leu Leu His Gln Asn Pro Leu755 760 765 Glu Arg Leu Gly Thr Gly Ser Ala Tyr Glu Val Lys Gln His ProPhe 770 775 780 Phe Thr Gly Leu Asp Trp Thr Gly Leu Leu Arg Gln Lys AlaGlu Phe 785 790 795 800 Ile Pro Gln Leu Glu Ser Glu Asp Asp Thr Ser TyrPhe Asp Thr Arg 805 810 815 Ser Glu Arg Tyr His His Met Asp Ser Glu AspGlu Glu Glu Val Ser 820 825 830 Glu Asp Gly Cys Leu Glu Ile Arg Gln PheSer Ser Cys Ser Pro Arg 835 840 845 Phe Asn Lys Val Tyr Ser Ser Met GluArg Leu Ser Leu Leu Glu Glu 850 855 860 Arg Arg Thr Pro Pro Pro Thr LysArg Ser Leu Ser Glu Glu Lys Glu 865 870 875 880 Asp His Ser Asp Gly LeuAla Gly Leu Lys Gly Arg Asp Arg Ser Trp 885 890 895 Val Ile Gly Ser ProGlu Ile Leu Arg Lys Arg Leu Ser Val Ser Glu 900 905 910 Ser Ser His ThrGlu Ser Asp Ser Ser Pro Pro Met Thr Val Arg Arg 915 920 925 Arg Cys SerGly Leu Leu Asp Ala Pro Arg Phe Pro Glu Gly Pro Glu 930 935 940 Glu AlaSer Ser 
Thr Leu Arg Arg Gln Pro Gln Glu Gly Ile Trp Val 945 950 955 960Leu Thr Pro Pro Ser Gly Glu Gly Val Ser Gly Pro Val Thr Glu His 965 970975 Ser Gly Glu Gln Arg Pro Lys Leu Asp Glu Glu Ala Val Gly Arg Ser 980985 990 Ser Gly Ser Ser Pro Ala Met Glu Thr Arg Gly Arg Gly Thr Ser Gln995 1000 1005 Leu Ala Glu Gly Ala Thr Ala Lys Ala Ile Ser Asp Leu AlaVal 1010 1015 1020 Arg Arg Ala Arg His Arg Leu Leu Ser Gly Asp Ser ThrGlu Lys 1025 1030 1035 Arg Thr Ala Arg Pro Val Asn Lys Val Ile Lys SerAla Ser Ala 1040 1045 1050 Thr Ala Leu Ser Leu Leu Ile Pro Ser Glu HisHis Thr Cys Ser 1055 1060 1065 Pro Leu Ala Ser Pro Met Ser Pro His SerGln Ser Ser Asn Pro 1070 1075 1080 Ser Ser Arg Asp Ser Ser Pro Ser ArgAsp Phe Leu Pro Ala Leu 1085 1090 1095 Gly Ser Met Arg Pro Pro Ile IleIle His Arg Ala Gly Lys Lys 1100 1105 1110 Tyr Gly Phe Thr Leu Arg AlaIle Arg Val Tyr Met Gly Asp Ser 1115 1120 1125 Asp Val Tyr Thr Val HisHis Met Val Trp His Val Glu Asp Gly 1130 1135 1140 Gly Pro Ala Ser GluAla Gly Leu Arg Gln Gly Asp Leu Ile Thr 1145 1150 1155 His Val Asn GlyGlu Pro Val His Gly Leu Val His Thr Glu Val 1160 1165 1170 Val Glu LeuIle Leu Lys Ser Gly Asn Lys Val Ala Ile Ser Thr 1175 1180 1185 Thr ProLeu Glu Asn Thr Ser Ile Lys Val Gly Pro Ala Arg Lys 1190 1195 1200 GlySer Tyr Lys Ala Lys Met Ala Arg Arg Ser Lys Arg Ser Arg 1205 1210 1215Gly Lys Asp Gly Gln Glu Ser Arg Lys Arg Ser Ser Leu Phe Arg 1220 12251230 Lys Ile Thr Lys Gln Ala Ser Leu Leu His Thr Ser Arg Ser Leu 12351240 1245 Ser Ser Leu Asn Arg Ser Leu Ser Ser Gly Glu Ser Gly Pro Gly1250 1255 1260 Ser Pro Thr His Ser His Ser Leu Ser Pro Arg Ser Pro ThrGln 1265 1270 1275 Gly Tyr Arg Val Thr Pro Asp Ala Val His Ser Val GlyGly Asn 1280 1285 1290 Ser Ser Gln Ser Ser Ser Pro Ser Ser Ser Val ProSer Ser Pro 1295 1300 1305 Ala Gly Ser Gly His Thr Arg Pro Ser Ser LeuHis Gly Leu Ala 1310 1315 1320 Pro Lys Leu Gln Arg Gln Tyr Arg Ser ProArg Arg Lys Ser Ala 1325 1330 1335 Gly Ser Ile Pro Leu Ser Pro Leu AlaHis Thr Pro Ser Pro Pro 1340 1345 1350 Pro Pro 
Thr Ala Ser Pro Gln ArgSer Pro Ser Pro Leu Ser Gly 1355 1360 1365 His Val Ala Gln Ala Phe ProThr Lys Leu His Leu Ser Pro Pro 1370 1375 1380 Leu Gly Arg Gln Leu SerArg Pro Lys Ser Ala Glu Pro Pro Arg 1385 1390 1395 Ser Pro Leu Leu LysArg Val Gln Ser Ala Glu Lys Leu Ala Ala 1400 1405 1410 Ala Leu Ala AlaSer Glu Lys Lys Leu Ala Thr Ser Arg Lys His 1415 1420 1425 Ser Leu AspLeu Pro His Ser Glu Leu Lys Lys Glu Leu Pro Pro 1430 1435 1440 Arg GluVal Ser Pro Leu Glu Val Val Gly Ala Arg Ser Val Leu 1445 1450 1455 SerGly Lys Gly Ala Leu Pro Gly Lys Gly Val Leu Gln Pro Ala 1460 1465 1470Pro Ser Arg Ala Leu Gly Thr Leu Arg Gln Asp Arg Ala Glu Arg 1475 14801485 Arg Glu Ser Leu Gln Lys Gln Glu Ala Ile Arg Glu Val Asp Ser 14901495 1500 Ser Glu Asp Asp Thr Glu Glu Gly Pro Glu Asn Ser Gln Gly Ala1505 1510 1515 Gln Glu Leu Ser Leu Ala Pro His Pro Glu Val Ser Gln SerVal 1520 1525 1530 Ala Pro Lys Gly Ala Gly Glu Ser Gly Glu Glu Asp ProPhe Pro 1535 1540 1545 Ser Arg Gly Pro Arg Ser Leu Gly Pro Met Val ProSer Leu Leu 1550 1555 1560 Thr Gly Ile Thr Leu Gly Pro Pro Arg Met GluSer Pro Ser Gly 1565 1570 1575 Pro His Arg Arg Leu Gly Ser Pro Gln AlaIle Glu Glu Ala Ala 1580 1585 1590 Ser Ser Ser Ser Ala Gly Pro Asn LeuGly Gln Ser Gly Ala Thr 1595 1600 1605 Asp Pro Ile Pro Pro Glu Gly CysTrp Lys Ala Gln His Leu His 1610 1615 1620 Thr Gln Ala Leu Thr Ala LeuSer Pro Ser Thr Ser Gly Leu Thr 1625 1630 1635 Pro Thr Ser Ser Cys SerPro Pro Ser Ser Thr Ser Gly Lys Leu 1640 1645 1650 Ser Met Trp Ser TrpLys Ser Leu Ile Glu Gly Pro Asp Arg Ala 1655 1660 1665 Ser Pro Ser ArgLys Ala Thr Met Ala Gly Gly Leu Ala Asn Leu 1670 1675 1680 Gln Asp LeuGlu Thr Gln Leu Gln Pro Ser Leu Arg Thr Cys Leu 1685 1690 1695 Pro GlySer Arg Gly Arg His Ser His Leu Val Pro Pro Asp Trp 1700 1705 1710 ProIle His Leu Met Arg Ile Pro Ala Arg Ala Gly Tyr Gly Ser 1715 1720 1725Leu Ser Val His Lys Gln 1730 15 1063 PRT Homo sapiens 15 Met Gly His IleLys Leu Thr Asp Phe Gly Leu Ser Lys Met Gly Leu 1 5 10 15 Met Ser LeuThr Thr Asn Leu 
Tyr Glu Gly His Ile Glu Lys Asp Ala 20 25 30 Arg Glu PheLeu Asp Lys Gln Val Cys Gly Thr Pro Glu Tyr Ile Ala 35 40 45 Pro Glu ValIle Leu Arg Gln Gly Tyr Gly Lys Pro Val Asp Trp Trp 50 55 60 Ala Met GlyIle Ile Leu Tyr Glu Phe Leu Val Gly Cys Val Pro Phe 65 70 75 80 Phe GlyAsp Thr Pro Glu Glu Leu Phe Gly Gln Val Ile Ser Asp Asp 85 90 95 Ile LeuTrp Pro Glu Gly Asp Glu Ala Leu Pro Thr Glu Ala Gln Leu 100 105 110 LeuIle Ser Ser Leu Leu Gln Thr Asn Pro Leu Val Arg Leu Gly Ala 115 120 125Gly Gly Ala Phe Glu Val Lys Gln His Ser Phe Phe Arg Asp Leu Asp 130 135140 Trp Thr Gly Leu Leu Arg Gln Lys Ala Glu Phe Ile Pro His Leu Glu 145150 155 160 Ser Glu Asp Asp Thr Ser Tyr Phe Asp Thr Arg Ser Asp Arg TyrHis 165 170 175 His Val Asn Ser Tyr Asp Glu Asp Asp Thr Thr Glu Glu GluPro Val 180 185 190 Glu Ile Arg Gln Phe Ser Ser Cys Ser Pro Arg Phe SerLys Val Tyr 195 200 205 Ser Ser Met Glu Gln Leu Ser Gln His Glu Pro LysThr Pro Val Ala 210 215 220 Ala Ala Gly Ser Ser Lys Arg Glu Pro Ser ThrLys Gly Pro Glu Glu 225 230 235 240 Lys Val Ala Gly Lys Arg Glu Gly LeuGly Gly Leu Thr Leu Arg Glu 245 250 255 Lys Thr Trp Arg Gly Gly Ser ProGlu Ile Lys Arg Phe Ser Ala Ser 260 265 270 Glu Ala Ser Phe Leu Glu GlyGlu Ala Ser Pro Pro Leu Gly Ala Arg 275 280 285 Arg Arg Phe Ser Ala LeuLeu Glu Pro Ser Arg Phe Ser Ala Pro Gln 290 295 300 Glu Asp Glu Asp GluAla Arg Leu Arg Arg Pro Pro Arg Pro Ser Ser 305 310 315 320 Asp Pro AlaGly Ser Leu Asp Ala Arg Ala Pro Lys Glu Glu Thr Gln 325 330 335 Gly GluGly Thr Ser Ser Ala Gly Asp Ser Glu Ala Thr Asp Arg Pro 340 345 350 ArgPro Gly Asp Leu Cys Pro Pro Ser Lys Asp Gly Asp Ala Ser Gly 355 360 365Pro Arg Ala Thr Asn Asp Leu Val Leu Arg Arg Ala Arg His Gln Gln 370 375380 Met Ser Gly Asp Val Ala Val Glu Lys Arg Pro Ser Arg Thr Gly Gly 385390 395 400 Lys Val Ile Lys Ser Ala Ser Ala Thr Ala Leu Ser Val Met IlePro 405 410 415 Ala Val Asp Pro His Gly Ser Ser Pro Leu Ala Ser Pro MetSer Pro 420 425 430 Arg Ser Leu Ser Ser Asn Pro Ser Ser Arg Asp Ser SerPro Ser Arg 435 440 445 
Asp Tyr Ser Pro Ala Val Ser Gly Leu Arg Ser ProIle Thr Ile Gln 450 455 460 Arg Ser Gly Lys Lys Tyr Gly Phe Thr Leu ArgAla Ile Arg Val Tyr 465 470 475 480 Met Gly Asp Thr Asp Val Tyr Ser ValHis His Ile Val Trp His Val 485 490 495 Glu Glu Gly Gly Pro Ala Gln GluAla Gly Leu Cys Ala Gly Asp Leu 500 505 510 Ile Thr His Val Asn Gly GluPro Val His Gly Met Val His Pro Glu 515 520 525 Val Val Glu Leu Ile LeuLys Ser Gly Asn Lys Val Ala Val Thr Thr 530 535 540 Thr Pro Phe Glu AsnThr Ser Ile Arg Ile Gly Pro Ala Arg Arg Ser 545 550 555 560 Ser Tyr LysAla Lys Met Ala Arg Arg Asn Lys Arg Pro Ser Ala Lys 565 570 575 Glu GlyGln Glu Ser Lys Lys Arg Ser Ser Leu Phe Arg Lys Ile Thr 580 585 590 LysGln Ser Asn Leu Leu His Thr Ser Arg Ser Leu Ser Ser Leu Asn 595 600 605Arg Ser Leu Ser Ser Ser Asp Ser Leu Pro Gly Ser Pro Thr His Gly 610 615620 Leu Pro Ala Arg Ser Pro Thr His Ser Tyr Arg Ser Thr Pro Asp Ser 625630 635 640 Ala Tyr Leu Gly Ala Ser Ser Gln Ser Ser Ser Pro Ala Ser SerThr 645 650 655 Pro Asn Ser Pro Ala Ser Ser Ala Ser His His Ile Arg ProSer Thr 660 665 670 Leu His Gly Leu Ser Pro Lys Leu His Arg Gln Tyr ArgSer Ala Arg 675 680 685 Cys Lys Ser Ala Gly Asn Ile Pro Leu Ser Pro LeuAla His Thr Pro 690 695 700 Ser Pro Thr Gln Ala Ser Pro Pro Pro Leu ProGly His Thr Val Gly 705 710 715 720 Ser Ser His Thr Thr Gln Ser Phe ProAla Lys Leu His Ser Ser Pro 725 730 735 Pro Val Val Arg Pro Arg Pro LysSer Ala Glu Pro Pro Arg Ser Pro 740 745 750 Leu Leu Lys Arg Val Gln SerAla Glu Lys Leu Gly Ala Ser Leu Ser 755 760 765 Ala Asp Lys Lys Gly AlaLeu Arg Lys His Ser Leu Glu Val Gly His 770 775 780 Pro Asp Phe Arg LysAsp Phe His Gly Glu Leu Ala Leu His Ser Leu 785 790 795 800 Ala Glu SerAsp Gly Glu Thr Pro Pro Val Glu Gly Leu Gly Ala Pro 805 810 815 Arg GlnVal Ala Val Arg Arg Leu Gly Arg Gln Glu Ser Pro Leu Ser 820 825 830 LeuGly Ala Asp Pro Leu Leu Pro Glu Gly Ala Ser Arg Pro Pro Val 835 840 845Ser Ser Lys Glu Lys Glu Ser Pro Gly Gly Ala Glu Ala Cys Thr Pro 850 855860 Pro Arg Ala Thr Thr Pro Gly Gly 
Arg Thr Leu Glu Arg Asp Val Gly 865870 875 880 Cys Thr Arg His Gln Ser Val Gln Thr Glu Asp Gly Thr Gly GlyMet 885 890 895 Ala Arg Ala Val Ala Lys Ala Ala Leu Ser Pro Val Gln GluHis Glu 900 905 910 Thr Gly Arg Arg Ser Ser Ser Gly Glu Ala Gly Thr ProLeu Val Pro 915 920 925 Ile Val Val Glu Pro Ala Arg Pro Gly Ala Lys AlaVal Val Pro Gln 930 935 940 Pro Leu Gly Ala Asp Ser Lys Gly Leu Gln GluPro Ala Pro Leu Ala 945 950 955 960 Pro Ser Val Pro Glu Ala Pro Arg GlyArg Glu Arg Trp Val Leu Glu 965 970 975 Val Val Glu Glu Arg Thr Thr LeuSer Gly Pro Arg Ser Lys Pro Ala 980 985 990 Ser Pro Lys Leu Ser Pro GluPro Gln Thr Pro Ser Leu Ala Pro Ala 995 1000 1005 Lys Cys Ser Ala ProSer Ser Ala Val Thr Pro Val Pro Pro Ala 1010 1015 1020 Ser Leu Leu GlySer Gly Thr Lys Pro Gln Val Gly Leu Thr Ser 1025 1030 1035 Arg Cys ProAla Glu Ala Val Pro Pro Ala Gly Leu Thr Lys Lys 1040 1045 1050 Gly ValSer Ser Pro Ala Pro Pro Gly Pro 1055 1060 16 1139 PRT Homo sapiens misc_feature (422)..(431) Xaa can be any naturally occurring amino acid 16 Met Met Asn His Val Tyr Arg Glu Arg Phe Pro Lys Ala Thr Ala Gln 1 510 15 Met Glu Gly Arg Leu Gln Glu Phe Leu Thr Ala Tyr Ala Pro Gly Ala 2025 30 Arg Leu Ala Leu Ala Asp Gly Val Leu Gly Phe Ile His His Gln Ile 3540 45 Val Glu Leu Ala Arg Asp Cys Leu Ala Lys Ser Gly Glu Asn Leu Val 5055 60 Thr Ser Arg Tyr Phe Leu Glu Met Gln Glu Lys Leu Glu Arg Leu Leu 6570 75 80 Gln Asp Ala His Glu Arg Ser Asp Ser Glu Glu Val Ser Phe Ile Val85 90 95 Gln Leu Val Arg Lys Leu Leu Ile Ile Ile Ser Arg Pro Ala Arg Leu100 105 110 Leu Glu Cys Leu Glu Phe Asp Pro Glu Glu Phe Tyr His Leu LeuGlu 115 120 125 Ala Ala Glu Gly His Ala Arg Glu Gly Gln Gly Ile Lys ThrAsp Leu 130 135 140 Pro Gln Tyr Ile Ile Gly Gln Leu Gly Leu Ala Lys AspPro Leu Glu 145 150 155 160 Glu Met Val Pro Leu Ser His Leu Glu Glu GluGln Pro Pro Ala Pro 165 170 175 Glu Ser Pro Glu Ser Arg Ala Leu Val GlyGln Ser Arg Arg Lys Pro 180 185 190 Cys Glu Ser Asp Phe Glu Thr Ile LysLeu Ile Ser Asn Gly Ala Tyr 195 200 205 Gly Ala Val 
Tyr Leu Val Arg HisArg Asp Thr Arg Gln Arg Phe Ala 210 215 220 Ile Lys Lys Ile Asn Lys GlnAsn Leu Ile Leu Arg Asn Gln Ile Gln 225 230 235 240 Gln Val Phe Val GluArg Asp Ile Leu Thr Phe Ala Glu Asn Pro Phe 245 250 255 Val Val Ser MetPhe Cys Ser Phe Glu Thr Arg Arg His Leu Cys Met 260 265 270 Val Met GluTyr Val Glu Gly Gly Asp Cys Ala Thr Leu Leu Lys Asn 275 280 285 Met GlyPro Leu Pro Val Asp Met Ala Arg Leu Tyr Phe Ala Glu Thr 290 295 300 ValLeu Ala Leu Glu Tyr Leu His Asn Tyr Gly Ile Val His Arg Asp 305 310 315320 Leu Lys Pro Asp Asn Leu Leu Ile Thr Ser Leu Gly His Ile Lys Leu 325330 335 Thr Asp Phe Gly Leu Ser Lys Ile Gly Leu Met Ser Met Ala Thr Asn340 345 350 Leu Tyr Glu Gly His Ile Glu Lys Asp Ala Arg Glu Phe Ile AspLys 355 360 365 Gln Val Cys Gly Thr Pro Glu Tyr Ile Ala Pro Glu Val IlePhe Arg 370 375 380 Gln Gly Tyr Gly Lys Pro Val Asp Trp Trp Ala Met GlyVal Val Leu 385 390 395 400 Tyr Glu Phe Leu Val Gly Cys Val Pro Phe PheGly Asp Thr Pro Glu 405 410 415 Glu Leu Phe Gly Gln Xaa Xaa Xaa Xaa XaaXaa Xaa Xaa Xaa Xaa Gly 420 425 430 Asp Glu Ala Leu Pro Ala Asp Ala GlnAsp Leu Ile Thr Arg Leu Leu 435 440 445 Arg Gln Ser Pro Leu Asp Arg LeuGly Thr Gly Gly Thr His Glu Val 450 455 460 Lys Gln His Pro Phe Phe LeuAla Leu Asp Trp Ala Gly Leu Leu Arg 465 470 475 480 His Lys Ala Glu PheVal Pro Gln Leu Glu Ala Glu Asp Asp Thr Ser 485 490 495 Tyr Phe Asp ThrArg Ser Glu Arg Tyr Arg His Leu Gly Ser Glu Asp 500 505 510 Asp Glu ThrAsn Asp Glu Glu Ser Ser Thr Glu Ile Pro Gln Phe Ser 515 520 525 Ser CysSer His Arg Phe Ser Lys Val Tyr Ser Ser Ser Glu Phe Leu 530 535 540 AlaVal Gln Pro Thr Pro Thr Phe Ala Glu Arg Ser Phe Ser Glu Asp 545 550 555560 Arg Glu Glu Gly Trp Glu Arg Ser Glu Val Asp Tyr Gly Arg Arg Leu 565570 575 Ser Ala Asp Ile Arg Leu Arg Ser Trp Thr Ser Ser Gly Ser Ser Cys580 585 590 Gln Ser Ser Ser Ser Gln Pro Glu Arg Gly Pro Ser Pro Ser LeuLeu 595 600 605 Asn Thr Ile Ser Leu Asp Thr Met Pro Lys Phe Ala Phe SerSer Glu 610 615 620 Asp Glu Gly Val Gly Pro Gly Pro Ala Gly Pro 
Lys ArgPro Val Phe 625 630 635 640 Ile Leu Gly Glu Pro Asp Pro Pro Pro Ala AlaThr Pro Val Met Pro 645 650 655 Lys Pro Ser Ser Leu Ser Ala Asp Thr AlaAla Leu Ser His Ala Arg 660 665 670 Leu Arg Ser Asn Ser Ile Gly Ala ArgHis Ser Thr Pro Arg Pro Leu 675 680 685 Asp Ala Gly Arg Gly Arg Arg LeuGly Gly Pro Arg Asp Pro Ala Pro 690 695 700 Glu Lys Ser Arg Ala Ser SerSer Gly Gly Ser Gly Gly Gly Ser Gly 705 710 715 720 Gly Arg Val Pro LysSer Ala Ser Val Ser Ala Leu Ser Leu Ile Ile 725 730 735 Thr Ala Asp AspGly Ser Gly Gly Pro Leu Met Ser Pro Leu Ser Pro 740 745 750 Arg Ser LeuSer Ser Asn Pro Ser Ser Arg Asp Ser Ser Pro Ser Arg 755 760 765 Asp ProSer Pro Val Cys Gly Ser Leu Arg Pro Pro Ile Val Ile His 770 775 780 SerSer Gly Lys Lys Tyr Gly Phe Ser Leu Arg Ala Ile Arg Val Tyr 785 790 795800 Met Gly Asp Ser Asp Val Tyr Thr Val His His Val Val Trp Ser Val 805810 815 Glu Asp Gly Ser Pro Ala Gln Glu Ala Gly Leu Arg Ala Gly Asp Leu820 825 830 Ile Thr His Ile Asn Gly Glu Ser Val Leu Gly Leu Val His MetAsp 835 840 845 Val Val Glu Leu Leu Leu Lys Ser Gly Asn Lys Ile Ser LeuArg Thr 850 855 860 Thr Ala Leu Glu Asn Thr Ser Ile Lys Val Gly Pro AlaArg Lys Asn 865 870 875 880 Val Ala Lys Gly Arg Met Ala Arg Arg Ser LysArg Ser Arg Arg Arg 885 890 895 Glu Thr Gln Asp Arg Arg Lys Ser Leu PheLys Lys Ile Ser Lys Gln 900 905 910 Thr Ser Val Leu His Thr Ser Arg SerPhe Ser Ser Gly Leu His His 915 920 925 Ser Leu Ser Ser Ser Glu Ser LeuPro Gly Ser Pro Thr His Ser Leu 930 935 940 Ser Pro Ser Pro Thr Thr ProCys Arg Ser Pro Ala Pro Asp Val Pro 945 950 955 960 Ala Asp Thr Thr AlaSer Pro Pro Ser Ala Ser Pro Ser Ser Ser Ser 965 970 975 Pro Ala Ser ProAla Ala Ala Gly His Thr Arg Pro Ser Ser Leu His 980 985 990 Gly Leu AlaAla Lys Leu Gly Pro Pro Arg Pro Lys Thr Gly Arg Arg 995 1000 1005 LysSer Thr Ser Ser Ile Pro Pro Ser Pro Leu Ala Cys Pro Pro 1010 1015 1020Ile Ser Ala Pro Pro Pro Arg Ser Pro Ser Pro Leu Pro Gly His 1025 10301035 Pro Pro Ala Pro Ala Arg Ser Pro Arg Leu Arg Arg Gly Gln Ser 10401045 1050 
Ala Asp Lys Leu Gly Thr Gly Glu Arg Leu Asp Gly Glu Ala Gly1055 1060 1065 Arg Arg Thr Arg Gly Pro Glu Ala Glu Leu Val Val Met ArgArg 1070 1075 1080 Leu His Leu Ser Glu Arg Arg Asp Ser Phe Lys Lys GlnGlu Ala 1085 1090 1095 Val Gln Glu Val Ser Phe Asp Glu Pro Gln Glu GluAla Thr Gly 1100 1105 1110 Leu Pro Thr Ser Val Pro Gln Ile Ala Val GluGly Glu Glu Ala 1115 1120 1125 Val Pro Val Ala Leu Gly Pro Thr Gly ArgAsp 1130 1135 17 2429 PRT Homo sapiens 17 Met Asp Met Ser Asp Pro AsnPhe Trp Thr Val Leu Ser Asn Phe Thr 1 5 10 15 Leu Pro His Leu Arg SerGly Asn Arg Leu Arg Arg Thr Gln Ser Cys 20 25 30 Arg Thr Ser Asn Arg LysSer Leu Ile Gly Asn Gly Gln Ser Pro Ala 35 40 45 Leu Pro Arg Pro His SerPro Leu Ser Ala His Ala Gly Asn Ser Pro 50 55 60 Gln Asp Ser Pro Arg AsnPhe Ser Pro Ser Ala Ser Ala His Phe Ser 65 70 75 80 Phe Ala Arg Arg ThrAsp Gly Arg Arg Trp Ser Leu Ala Ser Leu Pro 85 90 95 Ser Ser Gly Tyr GlyThr Asn Thr Pro Ser Ser Thr Val Ser Ser Ser 100 105 110 Cys Ser Ser GlnGlu Lys Leu His Gln Leu Pro Tyr Gln Pro Thr Pro 115 120 125 Asp Glu LeuHis Phe Leu Ser Lys His Phe Cys Thr Thr Glu Ser Ile 130 135 140 Ala ThrGlu Asn Arg Cys Arg Asn Thr Pro Met Arg Pro Arg Ser Arg 145 150 155 160Ser Leu Ser Pro Gly Arg Ser Pro Ala Cys Cys Asp His Glu Ile Ile 165 170175 Met Met Asn His Val Tyr Lys Glu Arg Phe Pro Lys Ala Thr Ala Gln 180185 190 Met Glu Glu Arg Leu Lys Glu Ile Ile Thr Ser Tyr Ser Pro Asp Asn195 200 205 Val Leu Pro Leu Ala Asp Gly Val Leu Ser Phe Thr His His GlnIle 210 215 220 Ile Glu Leu Ala Arg Asp Cys Leu Asp Lys Ser His Gln GlyLeu Ile 225 230 235 240 Thr Ser Arg Tyr Phe Leu Glu Leu Gln His Lys LeuAsp Lys Leu Leu 245 250 255 Gln Glu Ala His Asp Arg Ser Glu Ser Gly GluLeu Ala Phe Ile Lys 260 265 270 Gln Leu Val Arg Lys Ile Leu Ile Val IleAla Arg Pro Ala Arg Leu 275 280 285 Leu Glu Cys Leu Glu Phe Asp Pro GluGlu Phe Tyr Tyr Leu Leu Glu 290 295 300 Ala Ala Glu Gly His Ala Lys GluGly Gln Gly Ile Lys Thr Asp Ile 305 310 315 320 Pro Arg Tyr Ile Ile SerGln Leu Gly Leu Asn Lys Asp 
Pro Leu Glu 325 330 335 Glu Met Ala His LeuGly Asn Tyr Asp Ser Gly Thr Ala Glu Thr Pro 340 345 350 Glu Thr Asp GluSer Val Ser Ser Ser Asn Ala Ser Leu Lys Leu Arg 355 360 365 Arg Lys ProArg Glu Ser Asp Phe Glu Thr Ile Lys Leu Ile Ser Asn 370 375 380 Gly AlaTyr Gly Ala Val Tyr Phe Val Arg His Lys Glu Ser Arg Gln 385 390 395 400Arg Phe Ala Met Lys Lys Ile Asn Lys Gln Asn Leu Ile Leu Arg Asn 405 410415 Gln Ile Gln Gln Ala Phe Val Glu Arg Asp Ile Leu Thr Phe Ala Glu 420425 430 Asn Pro Phe Val Val Ser Met Tyr Cys Ser Phe Glu Thr Arg Arg His435 440 445 Leu Cys Met Val Met Glu Tyr Val Glu Gly Gly Asp Cys Ala ThrLeu 450 455 460 Met Lys Asn Met Gly Pro Leu Pro Val Asp Met Ala Arg MetTyr Phe 465 470 475 480 Ala Glu Thr Val Leu Ala Leu Glu Tyr Leu His AsnTyr Gly Ile Val 485 490 495 His Arg Asp Leu Lys Pro Asp Asn Leu Leu ValThr Ser Met Gly His 500 505 510 Ile Lys Leu Thr Asp Phe Gly Leu Ser LysVal Gly Leu Met Ser Met 515 520 525 Thr Thr Asn Leu Tyr Glu Gly His IleGlu Lys Asp Ala Arg Glu Phe 530 535 540 Leu Asp Lys Gln Val Cys Gly ThrPro Glu Tyr Ile Ala Pro Glu Val 545 550 555 560 Ile Leu Arg Gln Gly TyrGly Lys Pro Val Asp Trp Trp Ala Met Gly 565 570 575 Ile Ile Leu Tyr GluPhe Leu Val Gly Cys Val Pro Phe Phe Gly Asp 580 585 590 Thr Pro Glu GluLeu Phe Gly Gln Val Ile Ser Asp Glu Ile Asn Trp 595 600 605 Pro Glu LysAsp Glu Ala Pro Pro Pro Asp Ala Gln Asp Leu Ile Thr 610 615 620 Leu LeuLeu Arg Gln Asn Pro Leu Glu Arg Leu Gly Thr Gly Gly Ala 625 630 635 640Tyr Glu Val Lys Gln His Arg Phe Phe Arg Ser Leu Asp Trp Asn Ser 645 650655 Leu Leu Arg Gln Lys Ala Glu Phe Ile Pro Gln Leu Glu Ser Glu Asp 660665 670 Asp Thr Ser Tyr Phe Asp Thr Arg Ser Glu Lys Tyr His His Met Glu675 680 685 Thr Glu Glu Glu Asp Asp Thr Asn Asp Glu Asp Phe Asn Val GluIle 690 695 700 Arg Gln Phe Ser Ser Cys Ser His Arg Phe Ser Lys Val PheSer Ser 705 710 715 720 Ile Asp Arg Ile Thr Gln Asn Ser Ala Glu Glu LysGlu Asp Ser Val 725 730 735 Asp Lys Thr Lys Ser Thr Thr Leu Pro Ser ThrGlu Thr Leu Ser Trp 740 745 750 Ser Ser 
Glu Tyr Ser Glu Met Gln Gln LeuSer Thr Ser Asn Ser Ser 755 760 765 Asp Thr Glu Ser Asn Arg His Lys LeuSer Ser Gly Leu Leu Pro Lys 770 775 780 Leu Ala Ile Ser Thr Glu Gly GluGln Asp Glu Ala Ala Ser Cys Pro 785 790 795 800 Gly Asp Pro His Glu GluPro Gly Lys Pro Ala Leu Pro Pro Glu Glu 805 810 815 Cys Ala Gln Glu GluPro Glu Val Thr Thr Pro Ala Ser Thr Ile Ser 820 825 830 Ser Ser Thr LeuSer Val Gly Ser Phe Ser Glu His Leu Asp Gln Ile 835 840 845 Asn Gly ArgSer Glu Cys Val Asp Ser Thr Asp Asn Ser Ser Lys Pro 850 855 860 Ser SerGlu Pro Ala Ser His Met Ala Arg Gln Arg Leu Glu Ser Thr 865 870 875 880Glu Lys Lys Lys Ile Ser Gly Lys Val Thr Lys Ser Leu Ser Ala Ser 885 890895 Ala Leu Ser Leu Met Ile Pro Gly Asp Met Phe Ala Val Ser Pro Leu 900905 910 Gly Ser Pro Met Ser Pro His Ser Leu Ser Ser Asp Pro Ser Ser Ser915 920 925 Arg Asp Ser Ser Pro Ser Arg Asp Ser Ser Ala Ala Ser Ala SerPro 930 935 940 His Gln Pro Ile Val Ile His Ser Ser Gly Lys Asn Tyr GlyPhe Thr 945 950 955 960 Ile Arg Ala Ile Arg Val Tyr Val Gly Asp Ser AspIle Tyr Thr Val 965 970 975 His His Ile Val Trp Asn Val Glu Glu Gly SerPro Ala Cys Gln Ala 980 985 990 Gly Leu Lys Ala Gly Asp Leu Ile Thr HisIle Asn Gly Glu Pro Val 995 1000 1005 His Gly Leu Val His Thr Glu ValIle Glu Leu Leu Leu Lys Ser 1010 1015 1020 Gly Asn Lys Val Ser Ile ThrThr Thr Pro Phe Glu Asn Thr Ser 1025 1030 1035 Ile Lys Thr Gly Pro AlaArg Arg Asn Ser Tyr Lys Ser Arg Met 1040 1045 1050 Val Arg Arg Ser LysLys Ser Lys Lys Lys Glu Ser Leu Glu Arg 1055 1060 1065 Arg Arg Ser LeuPhe Lys Lys Leu Ala Lys Gln Pro Ser Pro Leu 1070 1075 1080 Leu His ThrSer Arg Ser Phe Ser Cys Leu Asn Arg Ser Leu Ser 1085 1090 1095 Ser GlyGlu Ser Leu Pro Gly Ser Pro Thr His Ser Leu Ser Pro 1100 1105 1110 ArgSer Pro Thr Pro Ser Tyr Arg Ser Thr Pro Asp Phe Pro Ser 1115 1120 1125Gly Thr Asn Ser Ser Gln Ser Ser Ser Pro Ser Ser Ser Ala Pro 1130 11351140 Asn Ser Pro Ala Gly Ser Gly His Ile Arg Pro Ser Thr Leu His 11451150 1155 Gly Leu Ala Pro Lys Leu Gly Gly Gln Arg Tyr Arg Ser Gly 
Arg1160 1165 1170 Arg Lys Ser Ala Gly Asn Ile Pro Leu Ser Pro Leu Ala ArgThr 1175 1180 1185 Pro Ser Pro Thr Pro Gln Pro Thr Ser Pro Gln Arg SerPro Ser 1190 1195 1200 Pro Leu Leu Gly His Ser Leu Gly Asn Ser Lys IleAla Gln Ala 1205 1210 1215 Phe Pro Ser Lys Met His Ser Pro Pro Thr IleVal Arg His Ile 1220 1225 1230 Val Arg Pro Lys Ser Ala Glu Pro Pro ArgSer Pro Leu Leu Lys 1235 1240 1245 Arg Val Gln Ser Glu Glu Lys Leu SerPro Ser Tyr Gly Ser Asp 1250 1255 1260 Lys Lys His Leu Cys Ser Arg LysHis Ser Leu Glu Val Thr Gln 1265 1270 1275 Glu Glu Val Gln Arg Glu GlnSer Gln Arg Glu Ala Pro Leu Gln 1280 1285 1290 Ser Leu Asp Glu Asn ValCys Asp Val Pro Pro Leu Ser Arg Ala 1295 1300 1305 Arg Pro Val Glu GlnGly Cys Leu Lys Arg Pro Val Ser Arg Lys 1310 1315 1320 Val Gly Arg GlnGlu Ser Val Asp Asp Leu Asp Arg Asp Lys Leu 1325 1330 1335 Lys Ala LysVal Val Val Lys Lys Ala Asp Gly Phe Pro Glu Lys 1340 1345 1350 Gln GluSer His Gln Lys Ser His Gly Pro Gly Ser Asp Leu Glu 1355 1360 1365 AsnPhe Ala Leu Phe Lys Leu Glu Glu Arg Glu Lys Lys Val Tyr 1370 1375 1380Pro Lys Ala Val Glu Arg Ser Ser Thr Phe Glu Asn Lys Ala Ser 1385 13901395 Met Gln Glu Ala Pro Pro Leu Gly Ser Leu Leu Lys Asp Ala Leu 14001405 1410 His Lys Gln Ala Ser Val Arg Ala Ser Glu Gly Ala Met Ser Asp1415 1420 1425 Gly Pro Val Pro Ala Glu His Arg Gln Gly Gly Gly Asp PheArg 1430 1435 1440 Arg Ala Pro Ala Pro Gly Thr Leu Gln Asp Gly Leu CysHis Ser 1445 1450 1455 Leu Asp Arg Gly Ile Ser Gly Lys Gly Glu Gly ThrGlu Lys Ser 1460 1465 1470 Ser Gln Ala Lys Glu Leu Leu Arg Cys Glu LysLeu Asp Ser Lys 1475 1480 1485 Leu Ala Asn Ile Asp Tyr Leu Arg Lys LysMet Ser Leu Glu Asp 1490 1495 1500 Lys Glu Asp Asn Leu Cys Pro Val LeuLys Pro Lys Met Thr Ala 1505 1510 1515 Gly Ser His Glu Cys Leu Pro GlyAsn Pro Val Arg Pro Thr Gly 1520 1525 1530 Gly Gln Gln Glu Pro Pro ProAla Ser Glu Ser Arg Ala Phe Val 1535 1540 1545 Ser Ser Thr His Ala AlaGln Met Ser Ala Val Ser Phe Val Pro 1550 1555 1560 Leu Lys Ala Leu ThrGly Arg Val Asp Ser Gly Thr Glu Lys Pro 
1565 1570 1575 Gly Leu Val AlaPro Glu Ser Pro Val Arg Lys Ser Pro Ser Glu 1580 1585 1590 Tyr Lys LeuGlu Gly Arg Ser Val Ser Cys Leu Lys Pro Ile Glu 1595 1600 1605 Gly ThrLeu Asp Ile Ala Leu Leu Ser Gly Pro Gln Ala Ser Lys 1610 1615 1620 ThrGlu Leu Pro Ser Pro Glu Ser Ala Gln Ser Pro Ser Pro Ser 1625 1630 1635Gly Asp Val Arg Ala Ser Val Pro Pro Val Leu Pro Ser Ser Ser 1640 16451650 Gly Lys Lys Asn Asp Thr Thr Ser Ala Arg Glu Leu Ser Pro Ser 16551660 1665 Ser Leu Lys Met Asn Lys Ser Tyr Leu Leu Glu Pro Trp Phe Leu1670 1675 1680 Pro Pro Ser Arg Gly Leu Gln Asn Ser Pro Ala Val Ser LeuPro 1685 1690 1695 Asp Pro Glu Phe Lys Arg Asp Arg Lys Gly Pro His ProThr Ala 1700 1705 1710 Arg Ser Pro Gly Thr Val Met Glu Ser Asn Pro GlnGln Arg Glu 1715 1720 1725 Gly Ser Ser Pro Lys His Gln Asp His Thr ThrAsp Pro Lys Leu 1730 1735 1740 Leu Thr Cys Leu Gly Gln Asn Leu His SerPro Asp Leu Ala Arg 1745 1750 1755 Pro Arg Cys Pro Leu Pro Pro Glu AlaSer Pro Ser Arg Glu Lys 1760 1765 1770 Pro Gly Leu Arg Glu Ser Ser GluArg Gly Pro Pro Thr Ala Arg 1775 1780 1785 Ser Glu Arg Ser Ala Ala ArgAla Asp Thr Cys Arg Glu Pro Ser 1790 1795 1800 Met Glu Leu Cys Phe ProGlu Thr Ala Lys Thr Ser Asp Asn Ser 1805 1810 1815 Lys Asn Leu Leu SerVal Gly Arg Thr His Pro Asp Phe Tyr Thr 1820 1825 1830 Gln Thr Gln AlaMet Glu Lys Ala Trp Ala Pro Gly Gly Lys Thr 1835 1840 1845 Asn His LysAsp Gly Pro Gly Glu Ala Arg Pro Pro Pro Arg Asp 1850 1855 1860 Asn SerSer Leu His Ser Ala Gly Ile Pro Cys Glu Lys Glu Leu 1865 1870 1875 GlyLys Val Arg Arg Gly Val Glu Pro Lys Pro Glu Ala Leu Leu 1880 1885 1890Ala Arg Arg Ser Leu Gln Pro Pro Gly Ile Glu Ser Glu Lys Ser 1895 19001905 Glu Lys Leu Ser Ser Phe Pro Ser Leu Gln Lys Asp Gly Ala Lys 19101915 1920 Glu Pro Glu Arg Lys Glu Gln Pro Leu Gln Arg His Pro Ser Ser1925 1930 1935 Ile Pro Pro Pro Pro Leu Thr Ala Lys Asp Leu Ser Ser ProAla 1940 1945 1950 Ala Arg Gln His Cys Ser Ser Pro Ser His Ala Ser GlyArg Glu 1955 1960 1965 Pro Gly Ala Lys Pro Ser Thr Ala Glu Pro Ser SerSer Pro Gln 
1970 1975 1980 Asp Pro Pro Lys Pro Val Ala Ala His Ser GluSer Ser Ser His 1985 1990 1995 Lys Pro Arg Pro Gly Pro Asp Pro Gly ProPro Lys Thr Lys His 2000 2005 2010 Pro Asp Arg Ser Leu Ser Ser Gln LysPro Ser Val Gly Ala Thr 2015 2020 2025 Lys Gly Lys Glu Pro Ala Thr GlnSer Leu Gly Gly Ser Ser Arg 2030 2035 2040 Glu Gly Lys Gly His Ser LysSer Gly Pro Asp Val Phe Pro Ala 2045 2050 2055 Thr Pro Gly Ser Gln AsnLys Ala Ser Asp Gly Ile Gly Gln Gly 2060 2065 2070 Glu Gly Gly Pro SerVal Pro Leu His Thr Asp Arg Ala Pro Leu 2075 2080 2085 Asp Ala Lys ProGln Pro Thr Ser Gly Gly Arg Pro Leu Glu Val 2090 2095 2100 Leu Glu LysPro Val His Leu Pro Arg Pro Gly His Pro Gly Pro 2105 2110 2115 Ser GluPro Ala Asp Gln Lys Leu Ser Ala Val Gly Glu Lys Gln 2120 2125 2130 ThrLeu Ser Pro Lys His Pro Lys Pro Ser Thr Val Lys Asp Cys 2135 2140 2145Pro Thr Leu Cys Lys Gln Thr Asp Asn Arg Gln Thr Asp Lys Ser 2150 21552160 Pro Ser Gln Pro Ala Ala Asn Thr Asp Arg Arg Ala Glu Gly Lys 21652170 2175 Lys Cys Thr Glu Ala Leu Tyr Ala Pro Ala Glu Gly Asp Lys Leu2180 2185 2190 Glu Ala Gly Leu Ser Phe Val His Ser Glu Asn Arg Leu LysGly 2195 2200 2205 Ala Glu Arg Pro Ala Ala Gly Val Gly Lys Gly Phe ProGlu Ala 2210 2215 2220 Arg Gly Lys Gly Pro Gly Pro Gln Lys Pro Pro ThrGlu Ala Asp 2225 2230 2235 Lys Pro Asn Gly Met Lys Arg Ser Pro Ser AlaThr Gly Gln Ser 2240 2245 2250 Ser Phe Arg Ser Thr Ala Leu Pro Glu LysSer Leu Ser Cys Ser 2255 2260 2265 Ser Ser Phe Pro Glu Thr Arg Ala GlyVal Arg Glu Ala Ser Ala 2270 2275 2280 Ala Ser Ser Asp Thr Ser Ser AlaLys Ala Ala Gly Gly Met Leu 2285 2290 2295 Glu Leu Pro Ala Pro Ser AsnArg Asp His Arg Lys Ala Gln Pro 2300 2305 2310 Ala Gly Glu Gly Arg ThrHis Met Thr Lys Ser Asp Ser Leu Pro 2315 2320 2325 Ser Phe Arg Val SerThr Leu Pro Leu Glu Ser His His Pro Asp 2330 2335 2340 Pro Asn Thr MetGly Gly Ala Ser His Arg Asp Arg Ala Leu Ser 2345 2350 2355 Val Thr AlaThr Val Gly Glu Thr Lys Gly Lys Asp Pro Ala Pro 2360 2365 2370 Ala GlnPro Pro Pro Ala Arg Lys Gln Asn Val Gly Arg Asp Val 
2375 2380 2385 ThrLys Pro Ser Pro Ala Pro Asn Thr Asp Arg Pro Ile Ser Leu 2390 2395 2400Ser Asn Glu Lys Asp Phe Val Val Arg Gln Arg Arg Gly Lys Glu 2405 24102415 Ser Leu Arg Ser Ser Pro His Lys Lys Ala Leu 2420 2425 18 2092 PRTHomo sapiens 18 Met Ala His Leu Gly Asn Tyr Asp Ser Gly Thr Ala Glu ThrPro Glu 1 5 10 15 Thr Asp Glu Ser Val Ser Ser Ser Asn Ala Ser Leu LysLeu Arg Arg 20 25 30 Lys Pro Arg Glu Ser Asp Phe Glu Thr Ile Lys Leu IleSer Asn Gly 35 40 45 Ala Tyr Gly Ala Val Tyr Phe Val Arg His Lys Glu SerArg Gln Arg 50 55 60 Phe Ala Met Lys Lys Ile Asn Lys Gln Asn Leu Ile LeuArg Asn Gln 65 70 75 80 Ile Gln Gln Ala Phe Val Glu Arg Asp Ile Leu ThrPhe Ala Glu Asn 85 90 95 Pro Phe Val Val Ser Met Tyr Cys Ser Phe Glu ThrArg Arg His Leu 100 105 110 Cys Met Val Met Glu Tyr Val Glu Gly Gly AspCys Ala Thr Leu Met 115 120 125 Lys Asn Met Gly Pro Leu Pro Val Asp MetAla Arg Met Tyr Phe Ala 130 135 140 Glu Thr Val Leu Ala Leu Glu Tyr LeuHis Asn Tyr Gly Ile Val His 145 150 155 160 Arg Asp Leu Lys Pro Asp AsnLeu Leu Val Thr Ser Met Gly His Ile 165 170 175 Lys Leu Thr Asp Phe GlyLeu Ser Lys Val Gly Leu Met Ser Met Thr 180 185 190 Thr Asn Leu Tyr GluGly His Ile Glu Lys Asp Ala Arg Glu Phe Leu 195 200 205 Asp Lys Gln ValCys Gly Thr Pro Glu Tyr Ile Ala Pro Glu Val Ile 210 215 220 Leu Arg GlnGly Tyr Gly Lys Pro Val Asp Trp Trp Ala Met Gly Ile 225 230 235 240 IleLeu Tyr Glu Phe Leu Val Gly Cys Val Pro Phe Phe Gly Asp Thr 245 250 255Pro Glu Glu Leu Phe Gly Gln Val Ile Ser Asp Glu Ile Asn Trp Pro 260 265270 Glu Lys Asp Glu Ala Pro Pro Pro Asp Ala Gln Asp Leu Ile Thr Leu 275280 285 Leu Leu Arg Gln Asn Pro Leu Glu Arg Leu Gly Thr Gly Gly Ala Tyr290 295 300 Glu Val Lys Gln His Arg Phe Phe Arg Ser Leu Asp Trp Asn SerLeu 305 310 315 320 Leu Arg Gln Lys Ala Glu Phe Ile Pro Gln Leu Glu SerGlu Asp Asp 325 330 335 Thr Ser Tyr Phe Asp Thr Arg Ser Glu Lys Tyr HisHis Met Glu Thr 340 345 350 Glu Glu Glu Asp Asp Thr Asn Asp Glu Asp PheAsn Val Glu Ile Arg 355 360 365 Gln Phe Ser Ser Cys Ser His Arg 
Phe SerLys Val Phe Ser Ser Ile 370 375 380 Asp Arg Ile Thr Gln Asn Ser Ala GluGlu Lys Glu Asp Ser Val Asp 385 390 395 400 Lys Thr Lys Ser Thr Thr LeuPro Ser Thr Glu Thr Leu Ser Trp Ser 405 410 415 Ser Glu Tyr Ser Glu MetGln Gln Leu Ser Thr Ser Asn Ser Ser Asp 420 425 430 Thr Glu Ser Asn ArgHis Lys Leu Ser Ser Gly Leu Leu Pro Lys Leu 435 440 445 Ala Ile Ser ThrGlu Gly Glu Gln Asp Glu Ala Ala Ser Cys Pro Gly 450 455 460 Asp Pro HisGlu Glu Pro Gly Lys Pro Ala Leu Pro Pro Glu Glu Cys 465 470 475 480 AlaGln Glu Glu Pro Glu Val Thr Thr Pro Ala Ser Thr Ile Ser Ser 485 490 495Ser Thr Leu Ser Val Gly Ser Phe Ser Glu His Leu Asp Gln Ile Asn 500 505510 Gly Arg Ser Glu Cys Val Asp Ser Thr Asp Asn Ser Ser Lys Pro Ser 515520 525 Ser Glu Pro Ala Ser His Met Ala Arg Gln Arg Leu Glu Ser Thr Glu530 535 540 Lys Lys Lys Ile Ser Gly Lys Val Thr Lys Ser Leu Ser Ala SerAla 545 550 555 560 Leu Ser Leu Met Ile Pro Gly Asp Met Phe Ala Val SerPro Leu Gly 565 570 575 Ser Pro Met Ser Pro His Ser Leu Ser Ser Asp ProSer Ser Ser Arg 580 585 590 Asp Ser Ser Pro Ser Arg Asp Ser Ser Ala AlaSer Ala Ser Pro His 595 600 605 Gln Pro Ile Val Ile His Ser Ser Gly LysAsn Tyr Gly Phe Thr Ile 610 615 620 Arg Ala Ile Arg Val Tyr Val Gly AspSer Asp Ile Tyr Thr Val His 625 630 635 640 His Ile Val Trp Asn Val GluGlu Gly Ser Pro Ala Cys Gln Ala Gly 645 650 655 Leu Lys Ala Gly Asp LeuIle Thr His Ile Asn Gly Glu Pro Val His 660 665 670 Gly Leu Val His ThrGlu Val Ile Glu Leu Leu Leu Lys Ser Gly Asn 675 680 685 Lys Val Ser IleThr Thr Thr Pro Phe Glu Asn Thr Ser Ile Lys Thr 690 695 700 Gly Pro AlaArg Arg Asn Ser Tyr Lys Ser Arg Met Val Arg Arg Ser 705 710 715 720 LysLys Ser Lys Lys Lys Glu Ser Leu Glu Arg Arg Arg Ser Leu Phe 725 730 735Lys Lys Leu Ala Lys Gln Pro Ser Pro Leu Leu His Thr Ser Arg Ser 740 745750 Phe Ser Cys Leu Asn Arg Ser Leu Ser Ser Gly Glu Ser Leu Pro Gly 755760 765 Ser Pro Thr His Ser Leu Ser Pro Arg Ser Pro Thr Pro Ser Tyr Arg770 775 780 Ser Thr Pro Asp Phe Pro Ser Gly Thr Asn Ser Ser Gln Ser SerSer 
785 790 795 800 Pro Ser Ser Ser Ala Pro Asn Ser Pro Ala Gly Ser GlyHis Ile Arg 805 810 815 Pro Ser Thr Leu His Gly Leu Ala Pro Lys Leu GlyGly Gln Arg Tyr 820 825 830 Arg Ser Gly Arg Arg Lys Ser Ala Gly Asn IlePro Leu Ser Pro Leu 835 840 845 Ala Arg Thr Pro Ser Pro Thr Pro Gln ProThr Ser Pro Gln Arg Ser 850 855 860 Pro Ser Pro Leu Leu Gly His Ser LeuGly Asn Ser Lys Ile Ala Gln 865 870 875 880 Ala Phe Pro Ser Lys Met HisSer Pro Pro Thr Ile Val Arg His Ile 885 890 895 Val Arg Pro Lys Ser AlaGlu Pro Pro Arg Ser Pro Leu Leu Lys Arg 900 905 910 Val Gln Ser Glu GluLys Leu Ser Pro Ser Tyr Gly Ser Asp Lys Lys 915 920 925 His Leu Cys SerArg Lys His Ser Leu Glu Val Thr Gln Glu Glu Val 930 935 940 Gln Arg GluGln Ser Gln Arg Glu Ala Pro Leu Gln Ser Leu Asp Glu 945 950 955 960 AsnVal Cys Asp Val Pro Pro Leu Ser Arg Ala Arg Pro Val Glu Gln 965 970 975Gly Cys Leu Lys Arg Pro Val Ser Arg Lys Val Gly Arg Gln Glu Ser 980 985990 Val Asp Asp Leu Asp Arg Asp Lys Leu Lys Ala Lys Val Val Val Lys 9951000 1005 Lys Ala Asp Gly Phe Pro Glu Lys Gln Glu Ser His Gln Lys Ser1010 1015 1020 His Gly Pro Gly Ser Asp Leu Glu Asn Phe Ala Leu Phe LysLeu 1025 1030 1035 Glu Glu Arg Glu Lys Lys Val Tyr Pro Lys Ala Val GluArg Ser 1040 1045 1050 Ser Thr Phe Glu Asn Lys Ala Ser Met Gln Glu AlaPro Pro Leu 1055 1060 1065 Gly Ser Leu Leu Lys Asp Ala Leu His Lys GlnAla Ser Val Arg 1070 1075 1080 Ala Ser Glu Gly Ala Met Ser Asp Gly ProVal Pro Ala Glu His 1085 1090 1095 Arg Gln Gly Gly Gly Asp Phe Arg ArgAla Pro Ala Pro Gly Thr 1100 1105 1110 Leu Gln Asp Gly Leu Cys His SerLeu Asp Arg Gly Ile Ser Gly 1115 1120 1125 Lys Gly Glu Gly Thr Glu LysSer Ser Gln Ala Lys Glu Leu Leu 1130 1135 1140 Arg Cys Glu Lys Leu AspSer Lys Leu Ala Asn Ile Asp Tyr Leu 1145 1150 1155 Arg Lys Lys Met SerLeu Glu Asp Lys Glu Asp Asn Leu Cys Pro 1160 1165 1170 Val Leu Lys ProLys Met Thr Ala Gly Ser His Glu Cys Leu Pro 1175 1180 1185 Gly Asn ProVal Arg Pro Thr Gly Gly Gln Gln Glu Pro Pro Pro 1190 1195 1200 Ala SerGlu Ser Arg Ala Phe Val Ser 
Ser Thr His Ala Ala Gln 1205 1210 1215 MetSer Ala Val Ser Phe Val Pro Leu Lys Ala Leu Thr Gly Arg 1220 1225 1230Val Asp Ser Gly Thr Glu Lys Pro Gly Leu Val Ala Pro Glu Ser 1235 12401245 Pro Val Arg Lys Ser Pro Ser Glu Tyr Lys Leu Glu Gly Arg Ser 12501255 1260 Val Ser Cys Leu Lys Pro Ile Glu Gly Thr Leu Asp Ile Ala Leu1265 1270 1275 Leu Ser Gly Pro Gln Ala Ser Lys Thr Glu Leu Pro Ser ProGlu 1280 1285 1290 Ser Ala Gln Ser Pro Ser Pro Ser Gly Asp Val Arg AlaSer Val 1295 1300 1305 Pro Pro Val Leu Pro Ser Ser Ser Gly Lys Lys AsnAsp Thr Thr 1310 1315 1320 Ser Ala Arg Glu Leu Ser Pro Ser Ser Leu LysMet Asn Lys Ser 1325 1330 1335 Tyr Leu Leu Glu Pro Trp Phe Leu Pro ProSer Arg Gly Leu Gln 1340 1345 1350 Asn Ser Pro Ala Val Ser Leu Pro AspPro Glu Phe Lys Arg Asp 1355 1360 1365 Arg Lys Gly Pro His Pro Thr AlaArg Ser Pro Gly Thr Val Met 1370 1375 1380 Glu Ser Asn Pro Gln Gln ArgGlu Gly Ser Ser Pro Lys His Gln 1385 1390 1395 Asp His Thr Thr Asp ProLys Leu Leu Thr Cys Leu Gly Gln Asn 1400 1405 1410 Leu His Ser Pro AspLeu Ala Arg Pro Arg Cys Pro Leu Pro Pro 1415 1420 1425 Glu Ala Ser ProSer Arg Glu Lys Pro Gly Leu Arg Glu Ser Ser 1430 1435 1440 Glu Arg GlyPro Pro Thr Ala Arg Ser Glu Arg Ser Ala Ala Arg 1445 1450 1455 Ala AspThr Cys Arg Glu Pro Ser Met Glu Leu Cys Phe Pro Glu 1460 1465 1470 ThrAla Lys Thr Ser Asp Asn Ser Lys Asn Leu Leu Ser Val Gly 1475 1480 1485Arg Thr His Pro Asp Phe Tyr Thr Gln Thr Gln Ala Met Glu Lys 1490 14951500 Ala Trp Ala Pro Gly Gly Lys Thr Asn His Lys Asp Gly Pro Gly 15051510 1515 Glu Ala Arg Pro Pro Pro Arg Asp Asn Ser Ser Leu His Ser Ala1520 1525 1530 Gly Ile Pro Cys Glu Lys Glu Leu Gly Lys Val Arg Arg GlyVal 1535 1540 1545 Glu Pro Lys Pro Glu Ala Leu Leu Ala Arg Arg Ser LeuGln Pro 1550 1555 1560 Pro Gly Ile Glu Ser Glu Lys Ser Glu Lys Leu SerSer Phe Pro 1565 1570 1575 Ser Leu Gln Lys Asp Gly Ala Lys Glu Pro GluArg Lys Glu Gln 1580 1585 1590 Pro Leu Gln Arg His Pro Ser Ser Ile ProPro Pro Pro Leu Thr 1595 1600 1605 Ala Lys Asp Leu Ser Ser Pro Ala AlaArg 
Gln His Cys Ser Ser 1610 1615 1620 Pro Ser His Ala Ser Gly Arg GluPro Gly Ala Lys Pro Ser Thr 1625 1630 1635 Ala Glu Pro Ser Ser Ser ProGln Asp Pro Pro Lys Pro Val Ala 1640 1645 1650 Ala His Ser Glu Ser SerSer His Lys Pro Arg Pro Gly Pro Asp 1655 1660 1665 Pro Gly Pro Pro LysThr Lys His Pro Asp Arg Ser Leu Ser Ser 1670 1675 1680 Gln Lys Pro SerVal Gly Ala Thr Lys Gly Lys Glu Pro Ala Thr 1685 1690 1695 Gln Ser LeuGly Gly Ser Ser Arg Glu Gly Lys Gly His Ser Lys 1700 1705 1710 Ser GlyPro Asp Val Phe Pro Ala Thr Pro Gly Ser Gln Asn Lys 1715 1720 1725 AlaSer Asp Gly Ile Gly Gln Gly Glu Gly Gly Pro Ser Val Pro 1730 1735 1740Leu His Thr Asp Arg Ala Pro Leu Asp Ala Lys Pro Gln Pro Thr 1745 17501755 Ser Gly Gly Arg Pro Leu Glu Val Leu Glu Lys Pro Val His Leu 17601765 1770 Pro Arg Pro Gly His Pro Gly Pro Ser Glu Pro Ala Asp Gln Lys1775 1780 1785 Leu Ser Ala Val Gly Glu Lys Gln Thr Leu Ser Pro Lys HisPro 1790 1795 1800 Lys Pro Ser Thr Val Lys Asp Cys Pro Thr Leu Cys LysGln Thr 1805 1810 1815 Asp Asn Arg Gln Thr Asp Lys Ser Pro Ser Gln ProAla Ala Asn 1820 1825 1830 Thr Asp Arg Arg Ala Glu Gly Lys Lys Cys ThrGlu Ala Leu Tyr 1835 1840 1845 Ala Pro Ala Glu Gly Asp Lys Leu Glu AlaGly Leu Ser Phe Val 1850 1855 1860 His Ser Glu Asn Arg Leu Lys Gly AlaGlu Arg Pro Ala Ala Gly 1865 1870 1875 Val Gly Lys Gly Phe Pro Glu AlaArg Gly Lys Gly Pro Gly Pro 1880 1885 1890 Gln Lys Pro Pro Thr Glu AlaAsp Lys Pro Asn Gly Met Lys Arg 1895 1900 1905 Ser Pro Ser Ala Thr GlyGln Ser Ser Phe Arg Ser Thr Ala Leu 1910 1915 1920 Pro Glu Lys Ser LeuSer Cys Ser Ser Ser Phe Pro Glu Thr Arg 1925 1930 1935 Ala Gly Val ArgGlu Ala Ser Ala Ala Ser Ser Asp Thr Ser Ser 1940 1945 1950 Ala Lys AlaAla Gly Gly Met Leu Glu Leu Pro Ala Pro Ser Asn 1955 1960 1965 Arg AspHis Arg Lys Ala Gln Pro Ala Gly Glu Gly Arg Thr His 1970 1975 1980 MetThr Lys Ser Asp Ser Leu Pro Ser Phe Arg Val Ser Thr Leu 1985 1990 1995Pro Leu Glu Ser His His Pro Asp Pro Asn Thr Met Gly Gly Ala 2000 20052010 Ser His Arg Asp Arg Ala Leu Ser Val Thr 
Ala Thr Val Gly Glu 20152020 2025 Thr Lys Gly Lys Asp Pro Ala Pro Ala Gln Pro Pro Pro Ala Arg2030 2035 2040 Lys Gln Asn Val Gly Arg Asp Val Thr Lys Pro Ser Pro AlaPro 2045 2050 2055 Asn Thr Asp Arg Pro Ile Ser Leu Ser Asn Glu Lys AspPhe Val 2060 2065 2070 Val Arg Gln Arg Arg Gly Lys Glu Ser Leu Arg SerSer Pro His 2075 2080 2085 Lys Lys Ala Leu 2090

What is claimed is:
 1. A method of identifying a candidate beta-catenin pathway modulating agent, said method comprising the steps of: a) providing an assay system comprising a MBCAT polypeptide or nucleic acid; b) contacting the assay system with a test agent under conditions whereby, but for the presence of the test agent, the system provides a reference activity; and c) detecting a test agent-biased activity of the assay system, wherein a difference between the test agent-biased activity and the reference activity identifies the test agent as a candidate beta-catenin pathway modulating agent.
 2. The method of claim 1 wherein the assay system comprises cultured cells that express the MBCAT polypeptide.
 3. The method of claim 2 wherein the cultured cells additionally have defective beta-catenin function.
 4. The method of claim 1 wherein the assay system includes a screening assay comprising a MBCAT polypeptide, and the candidate test agent is a small molecule modulator.
 5. The method of claim 4 wherein the assay is a kinase assay.
 6. The method of claim 1 wherein the assay system is selected from the group consisting of an apoptosis assay system, a cell proliferation assay system, an angiogenesis assay system, and a hypoxic induction assay system.
 7. The method of claim 1 wherein the assay system includes a binding assay comprising a MBCAT polypeptide and the candidate test agent is an antibody.
 8. The method of claim 1 wherein the assay system includes an expression assay comprising a MBCAT nucleic acid and the candidate test agent is a nucleic acid modulator.
 9. The method of claim 8 wherein the nucleic acid modulator is an antisense oligomer.
 10. The method of claim 8 wherein the nucleic acid modulator is a PMO.
 11. The method of claim 1 additionally comprising: d) administering the candidate beta-catenin pathway modulating agent identified in (c) to a model system comprising cells defective in beta-catenin function, and detecting a phenotypic change in the model system that indicates that the beta-catenin function is restored.
 12. The method of claim 11 wherein the model system is a mouse model with defective beta-catenin function.
 13. A method for modulating a beta-catenin pathway of a cell comprising contacting a cell defective in beta-catenin function with a candidate modulator that specifically binds to a MBCAT polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 14, 15, 16, 17 and 18, whereby beta-catenin function is restored.
 14. The method of claim 13 wherein the candidate modulator is administered to a vertebrate animal predetermined to have a disease or disorder resulting from a defect in beta-catenin function.
 15. The method of claim 13 wherein the candidate modulator is selected from the group consisting of an antibody and a small molecule.
 16. The method of claim 1, comprising the additional steps of: e) providing a secondary assay system comprising cultured cells or a non-human animal expressing MBCAT, f) contacting the secondary assay system with the test agent of (b) or an agent derived therefrom under conditions whereby, but for the presence of the test agent or agent derived therefrom, the system provides a reference activity; and g) detecting an agent-biased activity of the second assay system, wherein a difference between the agent-biased activity and the reference activity of the second assay system confirms the test agent or agent derived therefrom as a candidate beta-catenin pathway modulating agent, and wherein the second assay detects an agent-biased change in the beta-catenin pathway.
 17. The method of claim 16 wherein the secondary assay system comprises cultured cells.
 18. The method of claim 16 wherein the secondary assay system comprises a non-human animal.
 19. The method of claim 18 wherein the non-human animal mis-expresses a beta-catenin pathway gene.
 20. A method of modulating beta-catenin pathway in a mammalian cell comprising contacting the cell with an agent that specifically binds a MBCAT polypeptide or nucleic acid.
 21. The method of claim 20 wherein the agent is administered to a mammalian animal predetermined to have a pathology associated with the beta-catenin pathway.
 22. The method of claim 20 wherein the agent is a small molecule modulator, a nucleic acid modulator, or an antibody.
 23. A method for diagnosing a disease in a patient comprising: a) obtaining a biological sample from the patient; b) contacting the sample with a probe for MBCAT expression; c) comparing results from step (b) with a control; d) determining whether step (c) indicates a likelihood of disease.
 24. The method of claim 23 wherein said disease is cancer.
 25. The method according to claim 24, wherein said cancer is a cancer as shown in Table 1 as having >25% expression level.
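
For illustration only, and forming no part of the claims, the comparison recited in steps (b) and (c) of claim 1 might be sketched as follows. The function name, the use of replicate means, and the 1.5-fold cutoff are all assumptions made for this sketch; the claims recite only "a difference" between the test agent-biased activity and the reference activity and specify no particular threshold or statistic.

```python
from statistics import mean


def is_candidate_modulator(reference_readings, test_readings, min_fold_change=1.5):
    """Flag a test agent whose activity differs from the reference activity.

    reference_readings: assay readouts obtained without the test agent.
    test_readings: assay readouts obtained with the test agent present.
    min_fold_change: hypothetical cutoff (not taken from the source) for
        calling the two activities "different" in either direction.
    """
    reference = mean(reference_readings)
    test = mean(test_readings)
    if reference <= 0 or test <= 0:
        raise ValueError("activity readings must be positive")
    fold = test / reference
    # An increase or a decrease both count as a difference.
    return fold >= min_fold_change or fold <= 1 / min_fold_change


# Hypothetical readouts, e.g. arbitrary units from a kinase assay.
inhibited = is_candidate_modulator([100, 110, 90], [40, 45, 35])
unchanged = is_candidate_modulator([100, 100], [105, 95])
```

In practice, a screen would more likely apply a statistical test across replicates than a fixed fold-change cutoff; the cutoff is used here only to keep the sketch short.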